changeset 769 | f9686b22db7e |
parent 766 | e8402d8ec8e6 |
768:34f77b976b88 | 769:f9686b22db7e |
---|---|
3 Welcome back. |
3 Welcome back. |
4 Remember this slide. |
4 Remember this slide. |
5 |
5 |
6 2 |
6 2 |
7 00:00:09,700 --> 00:00:11,500 |
7 00:00:09,700 --> 00:00:11,500 |
8 This slide said, What is |
8 This slide said what is |
9 |
9 |
10 3 |
10 3 |
11 00:00:11,500 --> 00:00:14,500 |
11 00:00:11,500 --> 00:00:14,500 |
12 all wreck expression Metro |
12 our regular expression matcher |
13 actually supposed to do? |
13 actually supposed to do? |
14 |
14 |
15 4 |
15 4 |
16 00:00:14,500 --> 00:00:16,570 |
16 00:00:14,500 --> 00:00:16,570 |
17 It will take two arguments and |
17 It will take two arguments, |
18 |
18 |
19 5 |
19 5 |
20 00:00:16,570 --> 00:00:18,670 |
20 00:00:16,570 --> 00:00:18,670 |
21 reg expression R and a string |
21 a regular expression r and a string s. |
22 |
22 |
23 6 |
23 6 |
24 00:00:18,670 --> 00:00:21,580 |
24 00:00:18,670 --> 00:00:21,580 |
25 S. And it's supposed to say yes, |
25 And it's supposed to say yes, |
26 |
26 |
27 7 |
27 7 |
28 00:00:21,580 --> 00:00:23,440 |
28 00:00:21,580 --> 00:00:23,440 |
29 the wreck expression matches |
29 the regular expression matches |
30 |
30 |
31 8 |
31 8 |
32 00:00:23,440 --> 00:00:26,920 |
32 00:00:23,440 --> 00:00:26,920 |
33 the string if and only |
33 the string if and only |
34 if the string is in |
34 if the string is in |
35 |
35 |
36 9 |
36 9 |
37 00:00:26,920 --> 00:00:28,855 |
37 00:00:26,920 --> 00:00:28,855 |
38 the language of R. |
38 the language of r. |
39 |
39 |
40 10 |
40 10 |
41 00:00:28,855 --> 00:00:32,410 |
41 00:00:28,855 --> 00:00:32,410 |
42 And if the string is not |
42 And if the string is not |
43 in the language of our, |
43 in the language of r, |
44 |
44 |
45 11 |
45 11 |
46 00:00:32,410 --> 00:00:35,515 |
46 00:00:32,410 --> 00:00:35,515 |
47 then our algorithm has to say no. |
47 then our algorithm has to say no. |
48 |
48 |
55 this specification |
55 this specification |
56 directly because remember, |
56 directly because remember, |
57 |
57 |
58 14 |
58 14 |
59 00:00:39,565 --> 00:00:43,305 |
59 00:00:39,565 --> 00:00:43,305 |
60 this l Sometimes |
60 this L sometimes |
61 produces infinite sets. |
61 produces infinite sets. |
62 |
62 |
63 15 |
63 15 |
64 00:00:43,305 --> 00:00:47,585 |
64 00:00:43,305 --> 00:00:47,585 |
65 And so we can test whether a |
65 And we can't test whether a |
66 string is an infinite set, |
66 string is in infinite set, |
67 |
67 |
68 16 |
68 16 |
69 00:00:47,585 --> 00:00:50,090 |
69 00:00:47,585 --> 00:00:50,090 |
70 at least not easily. |
70 at least not easily. |
71 |
71 |
83 clever and implement |
83 clever and implement |
84 some operations |
84 some operations |
85 |
85 |
86 20 |
86 20 |
87 00:00:57,050 --> 00:00:59,284 |
87 00:00:57,050 --> 00:00:59,284 |
88 on Rekha expressions instead. |
88 on regular expressions instead, |
89 |
89 |
90 21 |
90 21 |
91 00:00:59,284 --> 00:01:03,275 |
91 00:00:59,284 --> 00:01:03,275 |
92 Because Weka expressions |
92 because regular expressions |
93 are always finite trees. |
93 are always finite trees. |
94 |
94 |
95 22 |
95 22 |
96 00:01:03,275 --> 00:01:05,870 |
96 00:01:03,275 --> 00:01:05,870 |
97 I should say the |
97 I should say the |
98 person who has been |
98 person who has been |
99 |
99 |
100 23 |
100 23 |
101 00:01:05,870 --> 00:01:08,150 |
101 00:01:05,870 --> 00:01:08,150 |
102 clever is called brush-off ski. |
102 clever is called Brzozowski. |
103 |
103 |
104 24 |
104 24 |
105 00:01:08,150 --> 00:01:11,435 |
105 00:01:08,150 --> 00:01:11,435 |
106 It's his, I've written, |
106 It's his algorithm |
107 I'm introducing here. |
107 I'm introducing here. |
108 |
108 |
109 25 |
109 25 |
110 00:01:11,435 --> 00:01:15,515 |
110 00:01:11,435 --> 00:01:15,515 |
111 And his algorithm consists |
111 And his algorithm consists |
121 a regular expression as argument. |
121 a regular expression as argument. |
122 |
122 |
123 28 |
123 28 |
124 00:01:20,104 --> 00:01:24,155 |
124 00:01:20,104 --> 00:01:24,155 |
125 And the idea is that |
125 And the idea is that |
126 this function nullable. |
126 this function nullable is |
127 |
127 |
128 29 |
128 29 |
129 00:01:24,155 --> 00:01:26,450 |
129 00:01:24,155 --> 00:01:26,450 |
130 Testing whether |
130 testing whether |
131 the reg expression |
131 the regular expression |
132 |
132 |
133 30 |
133 30 |
134 00:01:26,450 --> 00:01:27,950 |
134 00:01:26,450 --> 00:01:27,950 |
135 can match the empty string. |
135 can match the empty string. |
136 |
136 |
147 00:01:33,275 --> 00:01:35,465 |
147 00:01:33,275 --> 00:01:35,465 |
148 So that's defined as false. |
148 So that's defined as false. |
149 |
149 |
150 34 |
150 34 |
151 00:01:35,465 --> 00:01:37,775 |
151 00:01:35,465 --> 00:01:37,775 |
152 This reg expression, |
152 This regular expression, |
153 the whole purpose |
153 the whole purpose |
154 |
154 |
155 35 |
155 35 |
156 00:01:37,775 --> 00:01:39,680 |
156 00:01:37,775 --> 00:01:39,680 |
157 is that it can match |
157 is that it can match |
161 00:01:39,680 --> 00:01:41,645 |
161 00:01:39,680 --> 00:01:41,645 |
162 So it's defined as true. |
162 So it's defined as true. |
163 |
163 |
164 37 |
164 37 |
165 00:01:41,645 --> 00:01:44,599 |
165 00:01:41,645 --> 00:01:44,599 |
166 If a reg expression |
166 If a regular expression |
167 can match a character, |
167 can match a character, |
168 |
168 |
169 38 |
169 38 |
170 00:01:44,599 --> 00:01:47,045 |
170 00:01:44,599 --> 00:01:47,045 |
171 then it cannot match |
171 then it cannot match |
180 If an alternative can |
180 If an alternative can |
181 match the empty string, |
181 match the empty string, |
182 |
182 |
183 41 |
183 41 |
184 00:01:53,180 --> 00:01:56,960 |
184 00:01:53,180 --> 00:01:56,960 |
185 then either or one can |
185 then either r1 can |
186 match the empty string. |
186 match the empty string, |
187 |
187 |
188 42 |
188 42 |
189 00:01:56,960 --> 00:01:59,720 |
189 00:01:56,960 --> 00:01:59,720 |
190 Or R2 can match the empty string. |
190 or r2 can match the empty string. |
191 |
191 |
192 43 |
192 43 |
193 00:01:59,720 --> 00:02:03,110 |
193 00:01:59,720 --> 00:02:03,110 |
194 So either nullable |
194 So either nullable |
195 of R1 has to hold, |
195 of r1 has to hold, |
196 |
196 |
197 44 |
197 44 |
198 00:02:03,110 --> 00:02:06,945 |
198 00:02:03,110 --> 00:02:06,945 |
199 or nullable of R2 has to hold. |
199 or nullable of r2 has to hold. |
200 |
200 |
201 45 |
201 45 |
202 00:02:06,945 --> 00:02:09,925 |
202 00:02:06,945 --> 00:02:09,925 |
203 In this sequence, it's |
203 In this sequence, it's |
204 the other way around. |
204 the other way around. |
205 |
205 |
206 46 |
206 46 |
207 00:02:09,925 --> 00:02:12,790 |
207 00:02:09,925 --> 00:02:12,790 |
208 If this reg expression can |
208 If this regular expression can |
209 match the empty string, |
209 match the empty string, |
210 |
210 |
211 47 |
211 47 |
212 00:02:12,790 --> 00:02:16,615 |
212 00:02:12,790 --> 00:02:16,615 |
213 then R1 has to be able to |
213 then r1 has to be able to |
214 match the empty string, |
214 match the empty string, |
215 |
215 |
216 48 |
216 48 |
217 00:02:16,615 --> 00:02:20,290 |
217 00:02:16,615 --> 00:02:20,290 |
218 and R2 has to be able to |
218 and r2 has to be able to |
219 match the empty string. |
219 match the empty string. |
220 |
220 |
221 49 |
221 49 |
222 00:02:20,290 --> 00:02:22,555 |
222 00:02:20,290 --> 00:02:22,555 |
223 So here it's just the opposite. |
223 So here it's just the opposite. |
224 |
224 |
225 50 |
225 50 |
226 00:02:22,555 --> 00:02:25,660 |
226 00:02:22,555 --> 00:02:25,660 |
227 In one case it is o and |
227 In one case it is or and |
228 the other case it's end. |
228 the other case it is and. |
229 |
229 |
230 51 |
230 51 |
231 00:02:25,660 --> 00:02:27,970 |
231 00:02:25,660 --> 00:02:27,970 |
232 In the store record |
232 The star regular |
233 expression can match |
233 expression can match |
234 |
234 |
235 52 |
235 52 |
236 00:02:27,970 --> 00:02:30,445 |
236 00:02:27,970 --> 00:02:30,445 |
237 always the empty string. |
237 always the empty string. |
242 So this is a simple |
242 So this is a simple |
243 recursive function |
243 recursive function |
244 |
244 |
245 54 |
245 54 |
246 00:02:33,340 --> 00:02:37,179 |
246 00:02:33,340 --> 00:02:37,179 |
247 and should not be too |
247 and it should not be too |
248 difficult to implement. |
248 difficult to implement. |
249 |
249 |
250 55 |
250 55 |
251 00:02:37,179 --> 00:02:42,025 |
251 00:02:37,179 --> 00:02:42,025 |
252 Okay, this nullable function |
252 Okay, this nullable function |
271 If they have some problems |
271 If they have some problems |
272 with runtime, for example, |
272 with runtime, for example, |
273 |
273 |
274 60 |
274 60 |
275 00:02:57,305 --> 00:02:58,940 |
275 00:02:57,305 --> 00:02:58,940 |
276 we can't expect that as |
276 we can't expect that |
277 |
277 |
278 61 |
278 61 |
279 00:02:58,940 --> 00:03:03,260 |
279 00:02:58,940 --> 00:03:03,260 |
280 simple fix will solve all |
280 a simple fix will solve all |
281 the problems in the world. |
281 the problems in the world. |
282 |
282 |
283 62 |
283 62 |
284 00:03:03,260 --> 00:03:06,530 |
284 00:03:03,260 --> 00:03:06,530 |
285 So I admit the second |
285 So I admit the second |
291 that you understand it. |
291 that you understand it. |
292 |
292 |
293 64 |
293 64 |
294 00:03:10,085 --> 00:03:12,140 |
294 00:03:10,085 --> 00:03:12,140 |
295 And it also just |
295 And it also just |
296 chose this preserves |
296 shows this Brzozowski |
297 |
297 |
298 65 |
298 65 |
299 00:03:12,140 --> 00:03:14,345 |
299 00:03:12,140 --> 00:03:14,345 |
300 the is a very clever guy. |
300 is a very clever guy. |
301 |
301 |
302 66 |
302 66 |
303 00:03:14,345 --> 00:03:15,800 |
303 00:03:14,345 --> 00:03:15,800 |
304 Actually, I have to say he was |
304 Actually, I have to say he was |
305 |
305 |
333 function is the following. |
333 function is the following. |
334 |
334 |
335 73 |
335 73 |
336 00:03:34,685 --> 00:03:38,120 |
336 00:03:34,685 --> 00:03:38,120 |
337 Imagine you have a |
337 Imagine you have a |
338 reexpression and it can |
338 regular expression and it can |
339 |
339 |
340 74 |
340 74 |
341 00:03:38,120 --> 00:03:41,930 |
341 00:03:38,120 --> 00:03:41,930 |
342 match a string of the |
342 match a string of the |
343 form C followed by as. |
343 form c followed by s. |
344 |
344 |
345 75 |
345 75 |
346 00:03:41,930 --> 00:03:44,810 |
346 00:03:41,930 --> 00:03:44,810 |
347 So the C is the first |
347 So the c is the first |
348 character of that string. |
348 character of that string. |
349 |
349 |
350 76 |
350 76 |
351 00:03:44,810 --> 00:03:48,605 |
351 00:03:44,810 --> 00:03:48,605 |
352 So I mentioned that can |
352 So I imagine that r can |
353 match this kind of string. |
353 match this kind of string. |
354 |
354 |
355 77 |
355 77 |
356 00:03:48,605 --> 00:03:50,330 |
356 00:03:48,605 --> 00:03:50,330 |
357 The question is now, |
357 The question is now, |
358 |
358 |
359 78 |
359 78 |
360 00:03:50,330 --> 00:03:52,400 |
360 00:03:50,330 --> 00:03:52,400 |
361 what would a wrecker |
361 what would a regular |
362 expression look |
362 expression look |
363 |
363 |
364 79 |
364 79 |
365 00:03:52,400 --> 00:03:54,695 |
365 00:03:52,400 --> 00:03:54,695 |
366 like that can match chest |
366 like that can match just |
367 |
367 |
368 80 |
368 80 |
369 00:03:54,695 --> 00:03:57,140 |
369 00:03:54,695 --> 00:03:57,140 |
370 s. You probably remember |
370 s. You probably remember |
371 |
371 |
372 81 |
372 81 |
373 00:03:57,140 --> 00:03:59,300 |
373 00:03:57,140 --> 00:03:59,300 |
374 that from the semantic |
374 that from the semantic |
375 that every relative, |
375 derivative, |
376 |
376 |
377 82 |
377 82 |
378 00:03:59,300 --> 00:04:00,860 |
378 00:03:59,300 --> 00:04:00,860 |
379 there was also |
379 there was also |
380 something of chopping |
380 something of chopping |
384 off the first character. |
384 off the first character. |
385 |
385 |
386 84 |
386 84 |
387 00:04:02,210 --> 00:04:04,940 |
387 00:04:02,210 --> 00:04:04,940 |
388 Here it's the same. |
388 Here it's the same. |
389 If a reg expression |
389 If a regular expression |
390 |
390 |
391 85 |
391 85 |
392 00:04:04,940 --> 00:04:07,835 |
392 00:04:04,940 --> 00:04:07,835 |
393 can match a string |
393 can match a string |
394 starting with a C, |
394 starting with a c, |
395 |
395 |
396 86 |
396 86 |
397 00:04:07,835 --> 00:04:11,090 |
397 00:04:07,835 --> 00:04:11,090 |
398 we're looking for a wreck |
398 we're looking for a regular |
399 expression which can match |
399 expression which can match |
400 |
400 |
401 87 |
401 87 |
402 00:04:11,090 --> 00:04:15,215 |
402 00:04:11,090 --> 00:04:15,215 |
403 the rest of the string where |
403 the rest of the string where |
408 And this operation will be called |
408 And this operation will be called |
409 |
409 |
410 89 |
410 89 |
411 00:04:17,690 --> 00:04:21,049 |
411 00:04:17,690 --> 00:04:21,049 |
412 the derivative of a |
412 the derivative of a |
413 wreck expression. |
413 regular expression. |
414 |
414 |
415 90 |
415 90 |
416 00:04:21,049 --> 00:04:22,205 |
416 00:04:21,049 --> 00:04:22,205 |
417 And it will also take |
417 And it will also take |
418 |
418 |
419 91 |
419 91 |
420 00:04:22,205 --> 00:04:25,460 |
420 00:04:22,205 --> 00:04:25,460 |
421 a character as argument |
421 a character as argument |
422 and the rec expression. |
422 and a regular expression. |
423 |
423 |
424 92 |
424 92 |
425 00:04:25,460 --> 00:04:28,730 |
425 00:04:25,460 --> 00:04:28,730 |
426 And in contrast to |
426 And in contrast to |
427 the semantic records, |
427 the semantic |
428 |
428 |
429 93 |
429 93 |
430 00:04:28,730 --> 00:04:31,310 |
430 00:04:28,730 --> 00:04:31,310 |
431 semantic derivative, which works |
431 derivative, which works |
432 |
432 |
433 94 |
433 94 |
434 00:04:31,310 --> 00:04:34,430 |
434 00:04:31,310 --> 00:04:34,430 |
435 over languages or |
435 over languages or |
436 sets of strings. |
436 sets of strings, |
437 |
437 |
438 95 |
438 95 |
439 00:04:34,430 --> 00:04:39,620 |
439 00:04:34,430 --> 00:04:39,620 |
440 This derivative works |
440 this derivative works |
441 over regular expressions. |
441 over regular expressions. |
442 |
442 |
443 96 |
443 96 |
444 00:04:39,620 --> 00:04:41,330 |
444 00:04:39,620 --> 00:04:41,330 |
445 Okay, here's this function. |
445 Okay, here's this function. |
448 00:04:41,330 --> 00:04:43,970 |
448 00:04:41,330 --> 00:04:43,970 |
449 It's defined recursively over |
449 It's defined recursively over |
450 |
450 |
451 98 |
451 98 |
452 00:04:43,970 --> 00:04:46,370 |
452 00:04:43,970 --> 00:04:46,370 |
453 the structure of rec expressions. |
453 the structure of regular expressions. |
454 |
454 |
455 99 |
455 99 |
456 00:04:46,370 --> 00:04:48,814 |
456 00:04:46,370 --> 00:04:48,814 |
457 The first argument |
457 The first argument |
458 is the character, |
458 is the character, |
459 |
459 |
460 100 |
460 100 |
461 00:04:48,814 --> 00:04:52,700 |
461 00:04:48,814 --> 00:04:52,700 |
462 and the second one is |
462 and the second one is |
463 a wreck expression. |
463 a regular expression. |
464 |
464 |
465 101 |
465 101 |
466 00:04:52,700 --> 00:04:56,510 |
466 00:04:52,700 --> 00:04:56,510 |
467 And remember the idea |
467 And remember the idea |
468 is we're looking for |
468 is we're looking for |
469 |
469 |
470 102 |
470 102 |
471 00:04:56,510 --> 00:05:00,680 |
471 00:04:56,510 --> 00:05:00,680 |
472 a wreck expression that |
472 a regular expression that |
473 can match everything. |
473 can match everything |
474 |
474 |
475 103 |
475 103 |
476 00:05:00,680 --> 00:05:03,125 |
476 00:05:00,680 --> 00:05:03,125 |
477 This reg expression |
477 this regular expression |
478 as argument can match |
478 as argument can match |
479 |
479 |
480 104 |
480 104 |
481 00:05:03,125 --> 00:05:07,040 |
481 00:05:03,125 --> 00:05:07,040 |
482 except for the C. So now let's |
482 except for the c. So now let's |
483 look at this first case. |
483 look at this first case. |
484 |
484 |
485 105 |
485 105 |
486 00:05:07,040 --> 00:05:10,550 |
486 00:05:07,040 --> 00:05:10,550 |
487 So this reg expression |
487 So this reg expression |
492 So it certainly cannot |
492 So it certainly cannot |
493 match any string starting |
493 match any string starting |
494 |
494 |
495 107 |
495 107 |
496 00:05:14,270 --> 00:05:16,910 |
496 00:05:14,270 --> 00:05:16,910 |
497 a C. So we have to look |
497 a c. So we have to look |
498 |
498 |
499 108 |
499 108 |
500 00:05:16,910 --> 00:05:20,090 |
500 00:05:16,910 --> 00:05:20,090 |
501 for and reg expression which |
501 for a regular expression which |
502 can not much anything. |
502 can not match anything. |
503 |
503 |
504 109 |
504 109 |
505 00:05:20,090 --> 00:05:22,700 |
505 00:05:20,090 --> 00:05:22,700 |
506 Well, that's the purpose |
506 Well, that's the purpose |
507 of this record expression, |
507 of this regular expression, |
508 |
508 |
509 110 |
509 110 |
510 00:05:22,700 --> 00:05:24,815 |
510 00:05:22,700 --> 00:05:24,815 |
511 so we define it as 0. |
511 so we define it as 0. |
512 |
512 |
513 111 |
513 111 |
514 00:05:24,815 --> 00:05:28,130 |
514 00:05:24,815 --> 00:05:28,130 |
515 In the next case, |
515 In the next case, |
516 this reg expression |
516 this regular expression |
517 |
517 |
518 112 |
518 112 |
519 00:05:28,130 --> 00:05:30,440 |
519 00:05:28,130 --> 00:05:30,440 |
520 can match the empty string, |
520 can match the empty string, |
521 |
521 |
524 but it cannot match |
524 but it cannot match |
525 any string that starts |
525 any string that starts |
526 |
526 |
527 114 |
527 114 |
528 00:05:33,440 --> 00:05:36,350 |
528 00:05:33,440 --> 00:05:36,350 |
529 with C. So also in this case, |
529 with c. So also in this case, |
530 |
530 |
531 115 |
531 115 |
532 00:05:36,350 --> 00:05:39,560 |
532 00:05:36,350 --> 00:05:39,560 |
533 we just define it as |
533 we just define it as |
534 the bracket question, |
534 the regular expression, |
535 |
535 |
536 116 |
536 116 |
537 00:05:39,560 --> 00:05:41,615 |
537 00:05:39,560 --> 00:05:41,615 |
538 which cannot match anything. |
538 which cannot match anything. |
539 |
539 |
540 117 |
540 117 |
541 00:05:41,615 --> 00:05:45,170 |
541 00:05:41,615 --> 00:05:45,170 |
542 In the next case, this |
542 In the next case, this |
543 C as the argument to |
543 c is the argument to |
544 |
544 |
545 118 |
545 118 |
546 00:05:45,170 --> 00:05:48,335 |
546 00:05:45,170 --> 00:05:48,335 |
547 the derivative and this d |
547 the derivative and this d |
548 is the Rekha expression. |
548 is the regular expression. |
549 |
549 |
550 119 |
550 119 |
551 00:05:48,335 --> 00:05:50,225 |
551 00:05:48,335 --> 00:05:50,225 |
552 So now there are two cases. |
552 So now there are two cases. |
553 |
553 |
554 120 |
554 120 |
555 00:05:50,225 --> 00:05:53,525 |
555 00:05:50,225 --> 00:05:53,525 |
556 If this C and this D |
556 If this c and this d |
557 is actually equal. |
557 is actually equal, |
558 |
558 |
559 121 |
559 121 |
560 00:05:53,525 --> 00:05:55,970 |
560 00:05:53,525 --> 00:05:55,970 |
561 That means this record |
561 Thaat means this regular |
562 expression can match |
562 expression can match |
563 |
563 |
564 122 |
564 122 |
565 00:05:55,970 --> 00:05:59,240 |
565 00:05:55,970 --> 00:05:59,240 |
566 a string which just contains C0. |
566 a string which just contains c. |
567 |
567 |
568 123 |
568 123 |
569 00:05:59,240 --> 00:06:01,505 |
569 00:05:59,240 --> 00:06:01,505 |
570 So if we strip that off, |
570 So if we strip that off, |
571 |
571 |
572 124 |
572 124 |
573 00:06:01,505 --> 00:06:04,790 |
573 00:06:01,505 --> 00:06:04,790 |
574 motor remains is |
574 what remains is |
575 the empty string. |
575 the empty string. |
576 |
576 |
577 125 |
577 125 |
578 00:06:04,790 --> 00:06:06,890 |
578 00:06:04,790 --> 00:06:06,890 |
579 What is a regular expression |
579 What is a regular expression |
584 match the empty string. |
584 match the empty string. |
585 |
585 |
586 127 |
586 127 |
587 00:06:09,170 --> 00:06:11,915 |
587 00:06:09,170 --> 00:06:11,915 |
588 Well, that's the |
588 Well, that's the |
589 purpose of the warm. |
589 purpose of the 1. |
590 |
590 |
591 128 |
591 128 |
592 00:06:11,915 --> 00:06:15,440 |
592 00:06:11,915 --> 00:06:15,440 |
593 And if c is not equal to d, |
593 And if c is not equal to d, |
594 |
594 |
595 129 |
595 129 |
596 00:06:15,440 --> 00:06:17,630 |
596 00:06:15,440 --> 00:06:17,630 |
597 then this reg expression |
597 then this regular expression |
598 |
598 |
599 130 |
599 130 |
600 00:06:17,630 --> 00:06:19,220 |
600 00:06:17,630 --> 00:06:19,220 |
601 cannot match anything |
601 cannot match anything |
602 which starts with |
602 which starts with |
603 |
603 |
604 131 |
604 131 |
605 00:06:19,220 --> 00:06:23,780 |
605 00:06:19,220 --> 00:06:23,780 |
606 a C. So again it will |
606 a c. So again it will |
607 be defined as just 0. |
607 be defined as just 0. |
608 |
608 |
609 132 |
609 132 |
610 00:06:23,780 --> 00:06:29,390 |
610 00:06:23,780 --> 00:06:29,390 |
611 Okay? Now, the alternative case, |
611 Now, the alternative case, |
612 |
612 |
613 133 |
613 133 |
614 00:06:29,390 --> 00:06:31,970 |
614 00:06:29,390 --> 00:06:31,970 |
615 if this reg expression can |
615 if this regular expression can |
616 |
616 |
617 134 |
617 134 |
618 00:06:31,970 --> 00:06:36,050 |
618 00:06:31,970 --> 00:06:36,050 |
619 match the strings |
619 match strings |
620 starting with C, |
620 starting with c, |
621 |
621 |
622 135 |
622 135 |
623 00:06:36,050 --> 00:06:40,820 |
623 00:06:36,050 --> 00:06:40,820 |
624 then it can either be |
624 then it can either be |
625 matched by the Ongwen. |
625 matched by the r1 |
626 |
626 |
627 136 |
627 136 |
628 00:06:40,820 --> 00:06:44,495 |
628 00:06:40,820 --> 00:06:44,495 |
629 Or it can be much by the R2. |
629 or it can be matched by the r2. |
630 |
630 |
631 137 |
631 137 |
632 00:06:44,495 --> 00:06:50,090 |
632 00:06:44,495 --> 00:06:50,090 |
633 If they are one can match C |
633 If the r1 can match c |
634 and then following a string, |
634 and then following a string, |
635 |
635 |
636 138 |
636 138 |
637 00:06:50,090 --> 00:06:53,975 |
637 00:06:50,090 --> 00:06:53,975 |
638 then we just have to recursively |
638 then we just have to recursively |
639 call the derivative for |
639 call the derivative for |
640 |
640 |
641 139 |
641 139 |
642 00:06:53,975 --> 00:06:56,570 |
642 00:06:53,975 --> 00:06:56,570 |
643 R to get a reg expression |
643 r1 to get a regular expression |
644 |
644 |
645 140 |
645 140 |
646 00:06:56,570 --> 00:06:59,135 |
646 00:06:56,570 --> 00:06:59,135 |
647 that can match the |
647 that can match the |
648 rest of the string. |
648 rest of the string. |
649 |
649 |
650 141 |
650 141 |
651 00:06:59,135 --> 00:07:02,690 |
651 00:06:59,135 --> 00:07:02,690 |
652 And the same we can do with R2. |
652 And the same we can do with r2. |
653 |
653 |
654 142 |
654 142 |
655 00:07:02,690 --> 00:07:06,110 |
655 00:07:02,690 --> 00:07:06,110 |
656 You can find a reg expression |
656 You can find a regular expression |
657 which can match everything. |
657 which can match everything |
658 |
658 |
659 143 |
659 143 |
660 00:07:06,110 --> 00:07:07,850 |
660 00:07:06,110 --> 00:07:07,850 |
661 This R2 can match, |
661 this r2 can match, |
662 |
662 |
663 144 |
663 144 |
664 00:07:07,850 --> 00:07:09,710 |
664 00:07:07,850 --> 00:07:09,710 |
665 starting with C, bad, |
665 starting with c, but |
666 |
666 |
667 145 |
667 145 |
668 00:07:09,710 --> 00:07:12,590 |
668 00:07:09,710 --> 00:07:12,590 |
669 which chopping off this C. Okay? |
669 we are chopping off this c. |
670 |
670 |
671 146 |
671 146 |
672 00:07:12,590 --> 00:07:16,370 |
672 00:07:12,590 --> 00:07:16,370 |
673 So now if you have to find |
673 So now if you have to find |
674 the break expression, |
674 the regular expression, |
675 |
675 |
676 147 |
676 147 |
677 00:07:16,370 --> 00:07:20,030 |
677 00:07:16,370 --> 00:07:20,030 |
678 which can match all the strings |
678 which can match all the strings |
679 where C is tripped off. |
679 where c is chopped off, |
680 |
680 |
681 148 |
681 148 |
682 00:07:20,030 --> 00:07:22,295 |
682 00:07:20,030 --> 00:07:22,295 |
683 Then we just have to |
683 then we just have to |
684 take the alternatives |
684 take the alternative |
685 |
685 |
686 149 |
686 149 |
687 00:07:22,295 --> 00:07:24,965 |
687 00:07:22,295 --> 00:07:24,965 |
688 of these two derivatives. |
688 of these two derivatives. |
689 |
689 |
690 150 |
690 150 |
691 00:07:24,965 --> 00:07:29,390 |
691 00:07:24,965 --> 00:07:29,390 |
692 Okay? Now to sequence case, |
692 Now to sequence case. |
693 |
693 |
694 151 |
694 151 |
695 00:07:29,390 --> 00:07:33,005 |
695 00:07:29,390 --> 00:07:33,005 |
696 this sequence case is the |
696 This sequence case is the |
697 most complicated one. |
697 most complicated one. |
698 |
698 |
699 152 |
699 152 |
700 00:07:33,005 --> 00:07:35,180 |
700 00:07:33,005 --> 00:07:35,180 |
701 And let's look first at |
701 And let's look first at |
702 |
702 |
703 153 |
703 153 |
704 00:07:35,180 --> 00:07:39,335 |
704 00:07:35,180 --> 00:07:39,335 |
705 the second case where |
705 the second case, in the |
706 the Earth's brush. |
706 else branch. |
707 |
707 |
708 154 |
708 154 |
709 00:07:39,335 --> 00:07:42,830 |
709 00:07:39,335 --> 00:07:42,830 |
710 Okay? So if this |
710 So if this |
711 regular expression |
711 regular expression |
712 |
712 |
713 155 |
713 155 |
714 00:07:42,830 --> 00:07:46,145 |
714 00:07:42,830 --> 00:07:46,145 |
715 can match a string |
715 can match a string |
716 starting with C, |
716 starting with c, |
717 |
717 |
718 156 |
718 156 |
719 00:07:46,145 --> 00:07:48,155 |
719 00:07:46,145 --> 00:07:48,155 |
720 then the following |
720 then the following |
721 must have happened. |
721 must have happened. |
722 |
722 |
723 157 |
723 157 |
724 00:07:48,155 --> 00:07:51,905 |
724 00:07:48,155 --> 00:07:51,905 |
725 First, the R1 must have matched |
725 First, the r1 must have matched |
726 a string starting with C |
726 a string starting with c |
727 |
727 |
728 158 |
728 158 |
729 00:07:51,905 --> 00:07:56,420 |
729 00:07:51,905 --> 00:07:56,420 |
730 and then anything coming |
730 and then anything coming |
731 afterwards with r2. |
731 afterwards with r2. |
732 |
732 |
733 159 |
733 159 |
734 00:07:56,420 --> 00:07:59,660 |
734 00:07:56,420 --> 00:07:59,660 |
735 Okay? So in this case I only |
735 So in this case I only |
736 |
736 |
737 160 |
737 160 |
738 00:07:59,660 --> 00:08:03,260 |
738 00:07:59,660 --> 00:08:03,260 |
739 have to call this |
739 have to call this |
740 derivative for this r one. |
740 derivative for this r1, |
741 |
741 |
742 161 |
742 161 |
743 00:08:03,260 --> 00:08:06,830 |
743 00:08:03,260 --> 00:08:06,830 |
744 And find the reg expression |
744 and find the regular expression |
745 which can match everything |
745 which can match everything |
746 |
746 |
747 162 |
747 162 |
748 00:08:06,830 --> 00:08:11,555 |
748 00:08:06,830 --> 00:08:11,555 |
749 this R one can match except |
749 this r1 can match except |
750 for this C chopped off. |
750 for this c chopped off. |
751 |
751 |
752 163 |
752 163 |
753 00:08:11,555 --> 00:08:15,830 |
753 00:08:11,555 --> 00:08:15,830 |
754 And I have to build that |
754 And I have to build the |
755 sequence derivative of that. |
755 sequence derivative of that. |
756 |
756 |
757 164 |
757 164 |
758 00:08:15,830 --> 00:08:18,260 |
758 00:08:15,830 --> 00:08:18,260 |
759 So that's what the As per se. |
759 So that's what the else branch says: |
760 |
760 |
761 165 |
761 165 |
762 00:08:18,260 --> 00:08:21,860 |
762 00:08:18,260 --> 00:08:21,860 |
763 So I take the |
763 I take the |
764 derivative of R1 and I |
764 derivative of r1 and I |
765 |
765 |
766 166 |
766 166 |
767 00:08:21,860 --> 00:08:23,480 |
767 00:08:21,860 --> 00:08:23,480 |
768 put the R2 on |
768 put the r2 on |
769 |
769 |
770 167 |
770 167 |
771 00:08:23,480 --> 00:08:25,865 |
771 00:08:23,480 --> 00:08:25,865 |
772 the back because that's |
772 the back because that's |
773 the rest of the string. |
773 the rest of the string. |
774 |
774 |
775 168 |
775 168 |
776 00:08:25,865 --> 00:08:29,240 |
776 00:08:25,865 --> 00:08:29,240 |
777 Ok? So that's the only |
777 So that's the only |
778 case we have to consider |
778 case we have to consider |
779 |
779 |
780 169 |
780 169 |
781 00:08:29,240 --> 00:08:32,750 |
781 00:08:29,240 --> 00:08:32,750 |
782 this sequence case |
782 this sequence case |
783 except if not the, |
783 except if not the |
784 |
784 |
785 170 |
785 170 |
786 00:08:32,750 --> 00:08:37,895 |
786 00:08:32,750 --> 00:08:37,895 |
787 how one can match the |
787 r1 can match the |
788 sea and something else. |
788 c and something else. |
789 |
789 |
790 171 |
790 171 |
791 00:08:37,895 --> 00:08:42,965 |
791 00:08:37,895 --> 00:08:42,965 |
792 But if on mismatching |
792 But if r1 is matching |
793 actually the empty string, |
793 actually the empty string. |
794 |
794 |
795 172 |
795 172 |
796 00:08:42,965 --> 00:08:48,890 |
796 00:08:42,965 --> 00:08:48,890 |
797 this case actually the R two |
797 In this case actually the r2 |
798 is in charge of matching |
798 is in charge of matching |
799 |
799 |
800 173 |
800 173 |
801 00:08:48,890 --> 00:08:51,590 |
801 00:08:48,890 --> 00:08:51,590 |
802 the string starting with C. So in |
802 the string starting with c. So in |
803 |
803 |
804 174 |
804 174 |
805 00:08:51,590 --> 00:08:55,490 |
805 00:08:51,590 --> 00:08:55,490 |
806 this case we have to call |
806 this case we have to call |
807 the derivative for R2. |
807 the derivative for r2. |
808 |
808 |
809 175 |
809 175 |
810 00:08:55,490 --> 00:08:57,875 |
810 00:08:55,490 --> 00:08:57,875 |
811 So that's why we have |
811 So that's why we have |
812 these two cases. |
812 these two cases. |
813 |
813 |
814 176 |
814 176 |
815 00:08:57,875 --> 00:09:00,455 |
815 00:08:57,875 --> 00:09:00,455 |
816 So if R1 is nullable, |
816 So if r1 is nullable, |
817 |
817 |
818 177 |
818 177 |
819 00:09:00,455 --> 00:09:03,245 |
819 00:09:00,455 --> 00:09:03,245 |
820 then it can match |
820 then it can match |
821 the empty string. |
821 the empty string. |
825 And we have to |
825 And we have to |
826 consider the case that |
826 consider the case that |
827 |
827 |
828 179 |
828 179 |
829 00:09:05,330 --> 00:09:08,045 |
829 00:09:05,330 --> 00:09:08,045 |
830 this R2 is matching a string |
830 this r2 is matching a string |
831 |
831 |
832 180 |
832 180 |
833 00:09:08,045 --> 00:09:10,700 |
833 00:09:08,045 --> 00:09:10,700 |
834 starting with C. And so we have |
834 starting with c. And so we have |
835 |
835 |
836 181 |
836 181 |
837 00:09:10,700 --> 00:09:14,210 |
837 00:09:10,700 --> 00:09:14,210 |
838 to call the derivative |
838 to call the derivative |
839 for this R2 in this case. |
839 for this r2 in this case. |
840 |
840 |
841 182 |
841 182 |
842 00:09:14,210 --> 00:09:18,680 |
842 00:09:14,210 --> 00:09:18,680 |
843 Otherwise, the R1 will |
843 Otherwise, the r1 will |
844 be in charge of matching |
844 be in charge of matching |
845 |
845 |
846 183 |
846 183 |
847 00:09:18,680 --> 00:09:20,840 |
847 00:09:18,680 --> 00:09:20,840 |
848 a part of that |
848 a part of that |
849 string starting with |
849 string starting with |
850 |
850 |
851 184 |
851 184 |
852 00:09:20,840 --> 00:09:24,695 |
852 00:09:20,840 --> 00:09:24,695 |
853 C. And I have to call the |
853 c. And I have to call the |
854 derivative on this R1. |
854 derivative on this r1. |
855 |
855 |
856 185 |
856 185 |
857 00:09:24,695 --> 00:09:30,670 |
857 00:09:24,695 --> 00:09:30,670 |
858 Okay? I hope this makes sense. |
858 I hope this makes sense. |
859 |
859 |
860 186 |
860 186 |
861 00:09:30,670 --> 00:09:34,150 |
861 00:09:30,670 --> 00:09:34,150 |
862 Go over that and |
862 Go over that and |
863 also the handouts. |
863 also the handouts |
864 |
864 |
865 187 |
865 187 |
866 00:09:34,150 --> 00:09:37,465 |
866 00:09:34,150 --> 00:09:37,465 |
867 Again. With that, that's |
867 again. That case |
868 it's really subtle. |
868 is really subtle. |
869 |
869 |
870 188 |
870 188 |
871 00:09:37,465 --> 00:09:40,945 |
871 00:09:37,465 --> 00:09:40,945 |
872 And how do I remember this case? |
872 And how do I remember this case? |
873 |
873 |
884 00:09:45,310 --> 00:09:48,160 |
884 00:09:45,310 --> 00:09:48,160 |
885 then I will always remember |
885 then I will always remember |
886 |
886 |
887 192 |
887 192 |
888 00:09:48,160 --> 00:09:53,275 |
888 00:09:48,160 --> 00:09:53,275 |
889 the else branch where pushes |
889 the else branch where it pushes |
890 NG derivative over the R1. |
890 the derivative over the r1. |
891 |
891 |
892 193 |
892 193 |
893 00:09:53,275 --> 00:09:55,780 |
893 00:09:53,275 --> 00:09:55,780 |
894 So are usually write |
894 So I usually write |
895 down the S punch |
895 down the else branch first, |
896 |
896 |
897 194 |
897 194 |
898 00:09:55,780 --> 00:09:59,650 |
898 00:09:55,780 --> 00:09:59,650 |
899 first and then construct |
899 and then construct |
900 the thin branch by saying, |
900 the then branch by saying, |
901 |
901 |
902 195 |
902 195 |
903 00:09:59,650 --> 00:10:01,525 |
903 00:09:59,650 --> 00:10:01,525 |
904 well, I just repeat this. |
904 well, I just repeat this. |
905 |
905 |
909 in that case I |
909 in that case I |
910 |
910 |
911 197 |
911 197 |
912 00:10:03,760 --> 00:10:06,580 |
912 00:10:03,760 --> 00:10:06,580 |
913 have to build also |
913 have to build also |
914 derivative of R2. |
914 derivative of r2. |
915 |
915 |
916 198 |
916 198 |
917 00:10:06,580 --> 00:10:12,695 |
917 00:10:06,580 --> 00:10:12,695 |
918 Okay? Finally, the star case. |
918 Finally, the star case. |
919 |
919 |
920 199 |
920 199 |
921 00:10:12,695 --> 00:10:15,665 |
921 00:10:12,695 --> 00:10:15,665 |
922 Ok. So here again |
922 So here again |
923 we're looking for |
923 we're looking for |
924 |
924 |
925 200 |
925 200 |
926 00:10:15,665 --> 00:10:17,300 |
926 00:10:15,665 --> 00:10:17,300 |
927 a reg expression which |
927 a regular expression which |
928 |
928 |
929 201 |
929 201 |
930 00:10:17,300 --> 00:10:19,745 |
930 00:10:17,300 --> 00:10:19,745 |
931 can match the string |
931 can match the string |
932 starting with C, |
932 starting with c, |
933 |
933 |
934 202 |
934 202 |
935 00:10:19,745 --> 00:10:22,355 |
935 00:10:19,745 --> 00:10:22,355 |
936 except that the c has |
936 except that the c has |
937 been chopped off. |
937 been chopped off. |
938 |
938 |
939 203 |
939 203 |
940 00:10:22,355 --> 00:10:28,640 |
940 00:10:22,355 --> 00:10:28,640 |
941 So if r star has to match |
941 So if r* has to match |
942 a string starting with C, |
942 a string starting with c, |
943 |
943 |
944 204 |
944 204 |
945 00:10:28,640 --> 00:10:32,735 |
945 00:10:28,640 --> 00:10:32,735 |
946 then at least we need one |
946 then at least we need one |
947 copy of this r. Okay? |
947 copy of this r. |
948 |
948 |
949 205 |
949 205 |
950 00:10:32,735 --> 00:10:34,310 |
950 00:10:32,735 --> 00:10:34,310 |
951 So at least one copy of |
951 So at least one copy of |
952 |
952 |
955 this r has matched |
955 this r has matched |
956 something which starts with |
956 something which starts with |
957 |
957 |
958 207 |
958 207 |
959 00:10:37,010 --> 00:10:38,870 |
959 00:10:37,010 --> 00:10:38,870 |
960 a C and then afterwards |
960 a c and then afterwards |
961 |
961 |
962 208 |
962 208 |
963 00:10:38,870 --> 00:10:41,570 |
963 00:10:38,870 --> 00:10:41,570 |
964 come 0 more copies |
964 come 0 more copies |
965 of this OnStar. |
965 of this r*. |
966 |
966 |
967 209 |
967 209 |
968 00:10:41,570 --> 00:10:45,530 |
968 00:10:41,570 --> 00:10:45,530 |
969 Okay? What we do there |
969 What we do there |
970 is we trying to find |
970 is we are trying to find |
971 |
971 |
972 210 |
972 210 |
973 00:10:45,530 --> 00:10:47,960 |
973 00:10:45,530 --> 00:10:47,960 |
974 the Rekha expression |
974 the regular expression |
975 which can match |
975 which can match |
976 |
976 |
977 211 |
977 211 |
978 00:10:47,960 --> 00:10:50,915 |
978 00:10:47,960 --> 00:10:50,915 |
979 this first part of the string. |
979 this first part of the string. |
980 |
980 |
981 212 |
981 212 |
982 00:10:50,915 --> 00:10:53,255 |
982 00:10:50,915 --> 00:10:53,255 |
983 However, where the |
983 However, where the |
984 sea is chopped off. |
984 c is chopped off. |
985 |
985 |
986 213 |
986 213 |
987 00:10:53,255 --> 00:10:55,130 |
987 00:10:53,255 --> 00:10:55,130 |
988 And then we just say, well, |
988 And then we just say, well, |
989 |
989 |
996 00:10:57,050 --> 00:10:59,030 |
996 00:10:57,050 --> 00:10:59,030 |
997 0 or more copies of |
997 0 or more copies of |
998 |
998 |
999 216 |
999 216 |
1000 00:10:59,030 --> 00:11:02,600 |
1000 00:10:59,030 --> 00:11:02,600 |
1001 this R. So that's why |
1001 this r. So that's why |
1002 it's defined like this. |
1002 it's defined like this. |
1003 |
1003 |
1004 217 |
1004 217 |
1005 00:11:02,600 --> 00:11:09,050 |
1005 00:11:02,600 --> 00:11:09,050 |
1006 Okay? S8. Please take care |
1006 Okay? ...As said, please take care |
1007 with this definition. |
1007 with this definition. |
1008 |
1008 |
1009 218 |
1009 218 |
1010 00:11:09,050 --> 00:11:11,435 |
1010 00:11:09,050 --> 00:11:11,435 |
1011 That's not so simple. |
1011 That's not so simple. |
1027 different ways in |
1027 different ways in |
1028 the next slides. |
1028 the next slides. |
1029 |
1029 |
1030 223 |
1030 223 |
1031 00:11:20,825 --> 00:11:24,695 |
1031 00:11:20,825 --> 00:11:24,695 |
1032 Okay, let's look |
1032 Let's look |
1033 first some examples. |
1033 first at some examples. |
1034 |
1034 |
1035 224 |
1035 224 |
1036 00:11:24,695 --> 00:11:27,140 |
1036 00:11:24,695 --> 00:11:27,140 |
1037 So here is a regular expression, |
1037 So here is a regular expression, |
1038 |
1038 |
1039 225 |
1039 225 |
1040 00:11:27,140 --> 00:11:29,390 |
1040 00:11:27,140 --> 00:11:29,390 |
1041 R. And let's have |
1041 r. And let's have |
1042 |
1042 |
1043 226 |
1043 226 |
1044 00:11:29,390 --> 00:11:32,450 |
1044 00:11:29,390 --> 00:11:32,450 |
1045 a look at these three |
1045 a look at these three |
1046 derivatives according to a, |
1046 derivatives according to a, |
1047 |
1047 |
1048 227 |
1048 227 |
1049 00:11:32,450 --> 00:11:38,405 |
1049 00:11:32,450 --> 00:11:38,405 |
1050 B, and C. And Vishal do |
1050 b, and c. And we shall do |
1051 with d1 for the a. Ok. |
1051 the one for a. |
1052 |
1052 |
1053 228 |
1053 228 |
1054 00:11:38,405 --> 00:11:42,379 |
1054 00:11:38,405 --> 00:11:42,379 |
1055 So here is our reg expression |
1055 So here is our regular expression |
1056 |
1056 |
1057 229 |
1057 229 |
1058 00:11:42,379 --> 00:11:45,334 |
1058 00:11:42,379 --> 00:11:45,334 |
1059 and was very generous |
1059 and I was very generous |
1060 with dependent a thesis. |
1060 with parentheses. |
1061 |
1061 |
1062 230 |
1062 230 |
1063 00:11:45,334 --> 00:11:48,140 |
1063 00:11:45,334 --> 00:11:48,140 |
1064 And the outermost is a star. |
1064 And the outermost is a star. |
1065 |
1065 |
1066 231 |
1066 231 |
1067 00:11:48,140 --> 00:11:52,550 |
1067 00:11:48,140 --> 00:11:52,550 |
1068 So if people now the |
1068 So we now build the |
1069 derivative according to a, |
1069 derivative according to a, |
1070 |
1070 |
1071 232 |
1071 232 |
1072 00:11:52,550 --> 00:11:55,474 |
1072 00:11:52,550 --> 00:11:55,474 |
1073 the character a of |
1073 the character a, of |
1074 that wreck expression. |
1074 that regular expression. |
1075 |
1075 |
1076 233 |
1076 233 |
1077 00:11:55,474 --> 00:11:57,380 |
1077 00:11:55,474 --> 00:11:57,380 |
1078 Okay? So the first thing we |
1078 So the first thing we |
1079 |
1079 |
1080 234 |
1080 234 |
1081 00:11:57,380 --> 00:11:59,555 |
1081 00:11:57,380 --> 00:11:59,555 |
1082 have to analyze is the K star. |
1082 have to analyse is the case star. |
1083 |
1083 |
1084 235 |
1084 235 |
1085 00:11:59,555 --> 00:12:04,370 |
1085 00:11:59,555 --> 00:12:04,370 |
1086 Ok? So here's direct expression, |
1086 So here's the regular expression, |
1087 which we are looking at. |
1087 which we are looking at. |
1088 |
1088 |
1089 236 |
1089 236 |
1090 00:12:04,370 --> 00:12:09,170 |
1090 00:12:04,370 --> 00:12:09,170 |
1091 This are the outermost |
1091 This r and the outermost |
1092 constructor is this star. |
1092 constructor is this star. |
1093 |
1093 |
1094 237 |
1094 237 |
1095 00:12:09,170 --> 00:12:11,510 |
1095 00:12:09,170 --> 00:12:11,510 |
1096 If you go back to the definition, |
1096 If you go back to the definition, |
1099 00:12:11,510 --> 00:12:13,625 |
1099 00:12:11,510 --> 00:12:13,625 |
1100 I hope you have it next to you, |
1100 I hope you have it next to you, |
1101 |
1101 |
1102 239 |
1102 239 |
1103 00:12:13,625 --> 00:12:16,340 |
1103 00:12:13,625 --> 00:12:16,340 |
1104 then this star case is defined |
1104 then this star-case is defined |
1105 |
1105 |
1106 240 |
1106 240 |
1107 00:12:16,340 --> 00:12:20,000 |
1107 00:12:16,340 --> 00:12:20,000 |
1108 as u taking just the |
1108 as you're taking just the |
1109 inside of the star |
1109 inside of the star |
1110 |
1110 |
1111 241 |
1111 241 |
1112 00:12:20,000 --> 00:12:23,030 |
1112 00:12:20,000 --> 00:12:23,030 |
1113 and apply this derivative and |
1113 and apply this derivative and |
1114 |
1114 |
1115 242 |
1115 242 |
1116 00:12:23,030 --> 00:12:26,765 |
1116 00:12:23,030 --> 00:12:26,765 |
1117 leave the are on the |
1117 leave the r on the |
1118 outside at the end. |
1118 outside at the end. |
1119 |
1119 |
1120 243 |
1120 243 |
1121 00:12:26,765 --> 00:12:29,990 |
1121 00:12:26,765 --> 00:12:29,990 |
1122 Ok. So that's the first step. |
1122 Ok. So that's the first step. |
1126 Now we have to analyze |
1126 Now we have to analyze |
1127 |
1127 |
1128 245 |
1128 245 |
1129 00:12:32,030 --> 00:12:36,035 |
1129 00:12:32,030 --> 00:12:36,035 |
1130 the derivative according to |
1130 the derivative according to |
1131 a of this record expression, |
1131 a of this regular expression, |
1132 |
1132 |
1133 246 |
1133 246 |
1134 00:12:36,035 --> 00:12:38,000 |
1134 00:12:36,035 --> 00:12:38,000 |
1135 which is an alternative. |
1135 which is an alternative. |
1136 |
1136 |
1147 we just have to push the |
1147 we just have to push the |
1148 derivative into each component, |
1148 derivative into each component, |
1149 |
1149 |
1150 250 |
1150 250 |
1151 00:12:45,185 --> 00:12:47,705 |
1151 00:12:45,185 --> 00:12:47,705 |
1152 into the a, followed by b. |
1152 into the a followed by b. |
1153 |
1153 |
1154 251 |
1154 251 |
1155 00:12:47,705 --> 00:12:49,145 |
1155 00:12:47,705 --> 00:12:49,145 |
1156 And in this segment, |
1156 And into the second |
1157 |
1157 |
1158 252 |
1158 252 |
1159 00:12:49,145 --> 00:12:51,185 |
1159 00:12:49,145 --> 00:12:51,185 |
1160 alternative into b. Ok, |
1160 alternative, into b. |
1161 |
1161 |
1162 253 |
1162 253 |
1163 00:12:51,185 --> 00:12:56,030 |
1163 00:12:51,185 --> 00:12:56,030 |
1164 so we take the derivative |
1164 We take the derivative |
1165 of each according to a way. |
1165 of each according to a. |
1166 |
1166 |
1167 254 |
1167 254 |
1168 00:12:56,030 --> 00:13:00,635 |
1168 00:12:56,030 --> 00:13:00,635 |
1169 Now this one is a sequence |
1169 Now this one is a sequence |
1170 break expression. |
1170 regular expression. |
1171 |
1171 |
1172 255 |
1172 255 |
1173 00:13:00,635 --> 00:13:02,210 |
1173 00:13:00,635 --> 00:13:02,210 |
1174 This most complicated case. |
1174 This was the complicated case. |
1175 |
1175 |
1176 256 |
1176 256 |
1177 00:13:02,210 --> 00:13:04,160 |
1177 00:13:02,210 --> 00:13:04,160 |
1178 So the first of all |
1178 First of all |
1179 you have to test is, |
1179 we have to test is |
1180 |
1180 |
1181 257 |
1181 257 |
1182 00:13:04,160 --> 00:13:07,910 |
1182 00:13:04,160 --> 00:13:07,910 |
1183 is the first component |
1183 the first component |
1184 nullable of this sequence? |
1184 nullable of this sequence? |
1185 |
1185 |
1186 258 |
1186 258 |
1187 00:13:07,910 --> 00:13:09,200 |
1187 00:13:07,910 --> 00:13:09,200 |
1188 Well, that is a, |
1188 Well, that is a, |
1192 in this case, a on its |
1192 in this case, a on its |
1193 own is not nullable. |
1193 own is not nullable. |
1194 |
1194 |
1195 260 |
1195 260 |
1196 00:13:12,740 --> 00:13:14,210 |
1196 00:13:12,740 --> 00:13:14,210 |
1197 So vn, the easy case, |
1197 So we are the easy case, |
1198 |
1198 |
1199 261 |
1199 261 |
1200 00:13:14,210 --> 00:13:17,000 |
1200 00:13:14,210 --> 00:13:17,000 |
1201 we only have a single derivative |
1201 we only have a single derivative |
1202 |
1202 |
1218 00:13:27,920 --> 00:13:29,720 |
1218 00:13:27,920 --> 00:13:29,720 |
1219 And in this case the character |
1219 And in this case the character |
1220 |
1220 |
1221 266 |
1221 266 |
1222 00:13:29,720 --> 00:13:31,715 |
1222 00:13:29,720 --> 00:13:31,715 |
1223 in the regular expression agree. |
1223 and the regular expression agree. |
1224 |
1224 |
1225 267 |
1225 267 |
1226 00:13:31,715 --> 00:13:33,890 |
1226 00:13:31,715 --> 00:13:33,890 |
1227 So it's defined as one. |
1227 So it's defined as one. |
1228 |
1228 |
1230 00:13:33,890 --> 00:13:37,550 |
1230 00:13:33,890 --> 00:13:37,550 |
1231 Ok? In the other case, |
1231 Ok? In the other case, |
1232 |
1232 |
1233 269 |
1233 269 |
1234 00:13:37,550 --> 00:13:39,710 |
1234 00:13:37,550 --> 00:13:39,710 |
1235 the break expression is P, |
1235 the regular expression is b, |
1236 |
1236 |
1237 270 |
1237 270 |
1238 00:13:39,710 --> 00:13:41,675 |
1238 00:13:39,710 --> 00:13:41,675 |
1239 But the characters a. |
1239 but the characters a. |
1240 |
1240 |
1241 271 |
1241 271 |
1242 00:13:41,675 --> 00:13:46,385 |
1242 00:13:41,675 --> 00:13:46,385 |
1243 So in this case |
1243 So in this case |
1244 it's defined as 0. |
1244 it's defined as 0. |
1257 because originally we |
1257 because originally we |
1258 started with a star. |
1258 started with a star. |
1259 |
1259 |
1260 275 |
1260 275 |
1261 00:13:55,280 --> 00:13:58,295 |
1261 00:13:55,280 --> 00:13:58,295 |
1262 This expression is that |
1262 This regular expression is that star. |
1263 star at expression. |
|
1264 |
1263 |
1265 276 |
1264 276 |
1266 00:13:58,295 --> 00:14:02,780 |
1265 00:13:58,295 --> 00:14:02,780 |
1267 Ok? So the derivative |
1266 So the derivative |
1268 according to a |
1267 according to a |
1269 |
1268 |
1270 277 |
1269 277 |
1271 00:14:02,780 --> 00:14:07,610 |
1270 00:14:02,780 --> 00:14:07,610 |
1272 up that reg expression |
1271 of that regular expression |
1273 is this expression. |
1272 is this expression. |
1274 |
1273 |
1275 278 |
1274 278 |
1276 00:14:07,610 --> 00:14:10,970 |
1275 00:14:07,610 --> 00:14:10,970 |
1277 We just have to |
1276 We just have to |
1278 substitute this back in. |
1277 substitute this r back in. |
1279 |
1278 |
1280 279 |
1279 279 |
1281 00:14:10,970 --> 00:14:13,745 |
1280 00:14:10,970 --> 00:14:13,745 |
1282 Just coming back to this slide. |
1281 Just coming back to this slide. |
1283 |
1282 |
1284 280 |
1283 280 |
1285 00:14:13,745 --> 00:14:16,160 |
1284 00:14:13,745 --> 00:14:16,160 |
1286 So far, they're only analyze |
1285 So far, we only analyzes |
1287 |
1286 |
1288 281 |
1287 281 |
1289 00:14:16,160 --> 00:14:19,505 |
1288 00:14:16,160 --> 00:14:19,505 |
1290 the derivative according |
1289 the derivative according |
1291 to a single character. |
1290 to a single character. |
1305 to the empty string, |
1304 to the empty string, |
1306 |
1305 |
1307 285 |
1306 285 |
1308 00:14:27,905 --> 00:14:30,875 |
1307 00:14:27,905 --> 00:14:30,875 |
1309 we just return the |
1308 we just return the |
1310 Rekha expression. |
1309 regular expression. |
1311 |
1310 |
1312 286 |
1311 286 |
1313 00:14:30,875 --> 00:14:35,585 |
1312 00:14:30,875 --> 00:14:35,585 |
1314 If we have a string |
1313 If we have a string |
1315 starting with character c, |
1314 starting with character c, |
1316 |
1315 |
1317 287 |
1316 287 |
1318 00:14:35,585 --> 00:14:37,850 |
1317 00:14:35,585 --> 00:14:37,850 |
1319 remember that can |
1318 remember that can |
1320 be any character. |
1319 be any character, |
1321 |
1320 |
1322 288 |
1321 288 |
1323 00:14:37,850 --> 00:14:42,170 |
1322 00:14:37,850 --> 00:14:42,170 |
1324 Then we build first the simple |
1323 then we build first the simple |
1325 derivative according to |
1324 derivative according to |
1326 |
1325 |
1327 289 |
1326 289 |
1328 00:14:42,170 --> 00:14:44,360 |
1327 00:14:42,170 --> 00:14:44,360 |
1329 that first character and |
1328 that first character and |
1334 rest of the string. |
1333 rest of the string. |
1335 |
1334 |
1336 291 |
1335 291 |
1337 00:14:46,925 --> 00:14:50,615 |
1336 00:14:46,925 --> 00:14:50,615 |
1338 So here you see again, |
1337 So here you see again, |
1339 my personal convention. |
1338 my personal convention: |
1340 |
1339 |
1341 292 |
1340 292 |
1342 00:14:50,615 --> 00:14:54,365 |
1341 00:14:50,615 --> 00:14:54,365 |
1343 Everything which works on |
1342 everything which works on |
1344 lists has this S at the end. |
1343 lists has this s at the end. |
1345 |
1344 |
1346 293 |
1345 293 |
1347 00:14:54,365 --> 00:14:57,125 |
1346 00:14:54,365 --> 00:14:57,125 |
1348 So this function is |
1347 So this function is |
1349 for single characters. |
1348 for single characters. |
1385 state what the algorithm |
1384 state what the algorithm |
1386 is, the complete algorithm. |
1385 is, the complete algorithm. |
1387 |
1386 |
1388 302 |
1387 302 |
1389 00:15:20,600 --> 00:15:23,465 |
1388 00:15:20,600 --> 00:15:23,465 |
1390 So the pro shops ki mantra |
1389 So the Brzozowski matcher |
1391 |
1390 |
1392 303 |
1391 303 |
1393 00:15:23,465 --> 00:15:24,860 |
1392 00:15:23,465 --> 00:15:24,860 |
1394 takes a regular expression as |
1393 takes a regular expression as |
1395 |
1394 |
1399 string as argument. |
1398 string as argument. |
1400 |
1399 |
1401 305 |
1400 305 |
1402 00:15:26,915 --> 00:15:30,920 |
1401 00:15:26,915 --> 00:15:30,920 |
1403 And is supposed to say yes if |
1402 And is supposed to say yes if |
1404 the reg expression matches |
1403 the regular expression matches |
1405 |
1404 |
1406 306 |
1405 306 |
1407 00:15:30,920 --> 00:15:33,560 |
1406 00:15:30,920 --> 00:15:33,560 |
1408 the string or No |
1407 the string or no |
1409 |
1408 |
1410 307 |
1409 307 |
1411 00:15:33,560 --> 00:15:36,065 |
1410 00:15:33,560 --> 00:15:36,065 |
1412 if the reg expression does |
1411 if the regular expression does |
1413 not match the string. |
1412 not match the string. |
1414 |
1413 |
1415 308 |
1414 308 |
1416 00:15:36,065 --> 00:15:37,715 |
1415 00:15:36,065 --> 00:15:37,715 |
1417 And how does it do that? |
1416 And how does it do that? |
1418 |
1417 |
1419 309 |
1418 309 |
1420 00:15:37,715 --> 00:15:42,560 |
1419 00:15:37,715 --> 00:15:42,560 |
1421 Well, it takes this string |
1420 Well, it takes this string |
1422 s And this reg expression, |
1421 s and this regular expression, |
1423 |
1422 |
1424 310 |
1423 310 |
1425 00:15:42,560 --> 00:15:43,925 |
1424 00:15:42,560 --> 00:15:43,925 |
1426 and it first built |
1425 and it first builds |
1427 |
1426 |
1428 311 |
1427 311 |
1429 00:15:43,925 --> 00:15:48,845 |
1428 00:15:43,925 --> 00:15:48,845 |
1430 successive derivatives until |
1429 successive derivatives until |
1431 that string is exhaust that. |
1430 that string is exhausted. |
1432 |
1431 |
1433 312 |
1432 312 |
1434 00:15:48,845 --> 00:15:52,115 |
1433 00:15:48,845 --> 00:15:52,115 |
1435 Okay? Then you have |
1434 Then you have |
1436 a final derivative, |
1435 a final derivative, |
1437 |
1436 |
1438 313 |
1437 313 |
1439 00:15:52,115 --> 00:15:53,839 |
1438 00:15:52,115 --> 00:15:53,839 |
1440 a final reg expression. |
1439 a final regular expression |
1441 |
1440 |
1442 314 |
1441 314 |
1443 00:15:53,839 --> 00:15:55,370 |
1442 00:15:53,839 --> 00:15:55,370 |
1444 And you test whether |
1443 and you test whether |
1445 |
1444 |
1446 315 |
1445 315 |
1447 00:15:55,370 --> 00:15:57,920 |
1446 00:15:55,370 --> 00:15:57,920 |
1448 this reg expression can |
1447 this regular expression can |
1449 match the empty string. |
1448 match the empty string. |
1450 |
1449 |
1451 316 |
1450 316 |
1452 00:15:57,920 --> 00:16:01,370 |
1451 00:15:57,920 --> 00:16:01,370 |
1453 If yes, then the |
1452 If yes, then the |
1454 original reg expression |
1453 original regular expression, |
1455 |
1454 |
1456 317 |
1455 317 |
1457 00:16:01,370 --> 00:16:03,245 |
1456 00:16:01,370 --> 00:16:03,245 |
1458 is r can match the string. |
1457 this r, can match the string. |
1459 |
1458 |
1460 318 |
1459 318 |
1461 00:16:03,245 --> 00:16:05,210 |
1460 00:16:03,245 --> 00:16:05,210 |
1462 If no, if it cannot match |
1461 If no, if it cannot match |
1463 |
1462 |
1466 the final derivative |
1465 the final derivative |
1467 with the empty string, |
1466 with the empty string, |
1468 |
1467 |
1469 320 |
1468 320 |
1470 00:16:08,030 --> 00:16:10,280 |
1469 00:16:08,030 --> 00:16:10,280 |
1471 then know this |
1470 then no this |
1472 regular expression, |
1471 regular expression r |
1473 |
1472 |
1474 321 |
1473 321 |
1475 00:16:10,280 --> 00:16:12,905 |
1474 00:16:10,280 --> 00:16:12,905 |
1476 R cannot match that string. |
1475 cannot match that string. |
1477 |
1476 |
1478 322 |
1477 322 |
1479 00:16:12,905 --> 00:16:16,010 |
1478 00:16:12,905 --> 00:16:16,010 |
1480 I know it looks |
1479 I know it looks |
1481 very anticlimactic, |
1480 very anticlimactic, |
1489 00:16:19,625 --> 00:16:22,760 |
1488 00:16:19,625 --> 00:16:22,760 |
1490 that it's not that complicated. |
1489 that it's not that complicated. |
1491 |
1490 |
1492 325 |
1491 325 |
1493 00:16:22,760 --> 00:16:25,340 |
1492 00:16:22,760 --> 00:16:25,340 |
1494 So how does the algorithm work? |
1493 So how does the algorithm work |
1495 |
1494 |
1496 326 |
1495 326 |
1497 00:16:25,340 --> 00:16:27,634 |
1496 00:16:25,340 --> 00:16:27,634 |
1498 In a concrete example? |
1497 in a concrete example? |
1499 |
1498 |
1500 327 |
1499 327 |
1501 00:16:27,634 --> 00:16:31,580 |
1500 00:16:27,634 --> 00:16:31,580 |
1502 Imagine you have a string, abc |
1501 Imagine you have a string, abc |
1503 |
1502 |
1504 328 |
1503 328 |
1505 00:16:31,580 --> 00:16:34,370 |
1504 00:16:31,580 --> 00:16:34,370 |
1506 and you have a break |
1505 and you have a regular |
1507 expression, say R1. |
1506 expression, say r1. |
1508 |
1507 |
1509 329 |
1508 329 |
1510 00:16:34,370 --> 00:16:37,520 |
1509 00:16:34,370 --> 00:16:37,520 |
1511 And you want to find out |
1510 And you want to find out |
1512 whether this or one can match |
1511 whether this r1 can match |
1513 |
1512 |
1514 330 |
1513 330 |
1515 00:16:37,520 --> 00:16:41,300 |
1514 00:16:37,520 --> 00:16:41,300 |
1516 that string abc or not. |
1515 that string abc or not. |
1517 How do you do that? |
1516 How do you do that? |
1518 |
1517 |
1519 331 |
1518 331 |
1520 00:16:41,300 --> 00:16:45,140 |
1519 00:16:41,300 --> 00:16:45,140 |
1521 Well, you would first |
1520 Well, you would first |
1522 take this R1 and you |
1521 take this r1 and you |
1523 |
1522 |
1524 332 |
1523 332 |
1525 00:16:45,140 --> 00:16:47,150 |
1524 00:16:45,140 --> 00:16:47,150 |
1526 build the derivative according |
1525 build the derivative according |
1527 |
1526 |
1528 333 |
1527 333 |
1529 00:16:47,150 --> 00:16:49,880 |
1528 00:16:47,150 --> 00:16:49,880 |
1530 to the first character, D-A. |
1529 to the first character, the a. |
1531 |
1530 |
1532 334 |
1531 334 |
1533 00:16:49,880 --> 00:16:53,015 |
1532 00:16:49,880 --> 00:16:53,015 |
1534 Okay? You get the derivative, |
1533 Okay? You get the derivative, |
1535 |
1534 |
1536 335 |
1535 335 |
1537 00:16:53,015 --> 00:16:55,294 |
1536 00:16:53,015 --> 00:16:55,294 |
1538 which I call here R2. |
1537 which I call here r2. |
1539 |
1538 |
1540 336 |
1539 336 |
1541 00:16:55,294 --> 00:16:58,640 |
1540 00:16:55,294 --> 00:16:58,640 |
1542 Then you take the next |
1541 Then you take the next |
1543 character, the B. |
1542 character, the b. |
1544 |
1543 |
1545 337 |
1544 337 |
1546 00:16:58,640 --> 00:17:04,535 |
1545 00:16:58,640 --> 00:17:04,535 |
1547 You now build the derivative |
1546 You now build the derivative |
1548 according to b of this R2. |
1547 according to b of this r2. |
1549 |
1548 |
1550 338 |
1549 338 |
1551 00:17:04,535 --> 00:17:07,460 |
1550 00:17:04,535 --> 00:17:07,460 |
1552 Okay? So you take the |
1551 Okay? So you take the |
1553 result of the first step, |
1552 result of the first step, |
1562 second character. |
1561 second character. |
1563 |
1562 |
1564 341 |
1563 341 |
1565 00:17:11,810 --> 00:17:17,075 |
1564 00:17:11,810 --> 00:17:17,075 |
1566 Then you do this also with c. |
1565 Then you do this also with c. |
1567 So you get a derivative R3, |
1566 So you get a derivative r3, |
1568 |
1567 |
1569 342 |
1568 342 |
1570 00:17:17,075 --> 00:17:22,460 |
1569 00:17:17,075 --> 00:17:22,460 |
1571 and you build the derivative |
1570 and you build the derivative |
1572 of R three according to c, |
1571 of r3 according to c, |
1573 |
1572 |
1574 343 |
1573 343 |
1575 00:17:22,460 --> 00:17:24,185 |
1574 00:17:22,460 --> 00:17:24,185 |
1576 you get an R four. |
1575 you get an r4. |
1577 |
1576 |
1578 344 |
1577 344 |
1579 00:17:24,185 --> 00:17:26,300 |
1578 00:17:24,185 --> 00:17:26,300 |
1580 Okay, so that's the |
1579 Okay, so that's the |
1581 final derivative. |
1580 final derivative. |
1585 The string is exhausted. |
1584 The string is exhausted. |
1586 |
1585 |
1587 346 |
1586 346 |
1588 00:17:27,530 --> 00:17:29,570 |
1587 00:17:27,530 --> 00:17:29,570 |
1589 We build derivatives |
1588 We build derivatives |
1590 according to a, B, |
1589 according to a, b, |
1591 |
1590 |
1592 347 |
1591 347 |
1593 00:17:29,570 --> 00:17:34,610 |
1592 00:17:29,570 --> 00:17:34,610 |
1594 and C. Now we just test whether |
1593 and c. Now we just test whether |
1595 this r four is nullable. |
1594 this r4 is nullable. |
1596 |
1595 |
1597 348 |
1596 348 |
1598 00:17:34,610 --> 00:17:37,175 |
1597 00:17:34,610 --> 00:17:37,175 |
1599 If it says yes, |
1598 If it says yes, |
1600 |
1599 |
1601 349 |
1600 349 |
1602 00:17:37,175 --> 00:17:41,510 |
1601 00:17:37,175 --> 00:17:41,510 |
1603 then df break expression metro |
1602 then the regular expression matcher |
1604 will just say true, yes, |
1603 will just say true, yes, |
1605 |
1604 |
1606 350 |
1605 350 |
1607 00:17:41,510 --> 00:17:43,340 |
1606 00:17:41,510 --> 00:17:43,340 |
1608 this original reg expression, |
1607 this original regular expression, |
1609 |
1608 |
1610 351 |
1609 351 |
1611 00:17:43,340 --> 00:17:47,270 |
1610 00:17:43,340 --> 00:17:47,270 |
1612 the R1, will be able to |
1611 the r1, will be able to |
1613 match that string abc. |
1612 match that string abc. |
1614 |
1613 |
1615 352 |
1614 352 |
1616 00:17:47,270 --> 00:17:50,585 |
1615 00:17:47,270 --> 00:17:50,585 |
1617 And if this test returns false, |
1616 And if this test returns false, |
1620 00:17:50,585 --> 00:17:53,015 |
1619 00:17:50,585 --> 00:17:53,015 |
1621 then the algorithm says false. |
1620 then the algorithm says false. |
1622 |
1621 |
1623 354 |
1622 354 |
1624 00:17:53,015 --> 00:17:56,975 |
1623 00:17:53,015 --> 00:17:56,975 |
1625 This reg expression will |
1624 This regular expression will |
1626 not match the string. |
1625 not match the string. |
1627 |
1626 |
1628 355 |
1627 355 |
1629 00:17:56,975 --> 00:18:00,260 |
1628 00:17:56,975 --> 00:18:00,260 |
1630 Ok, you might ask |
1629 Ok, you might ask: |
1631 why on earth does |
1630 Why on earth does |
1632 |
1631 |
1633 356 |
1632 356 |
1634 00:18:00,260 --> 00:18:02,960 |
1633 00:18:00,260 --> 00:18:02,960 |
1635 that algorithm |
1634 that algorithm |
1636 actually work away? |
1635 actually work? |
1637 |
1636 |
1638 357 |
1637 357 |
1639 00:18:02,960 --> 00:18:06,515 |
1638 00:18:02,960 --> 00:18:06,515 |
1640 Here's an explanation |
1639 Here's anather explanation |
1641 for why it works. |
1640 for why it works. |
1642 |
1641 |
1643 358 |
1642 358 |
1644 00:18:06,515 --> 00:18:10,190 |
1643 00:18:06,515 --> 00:18:10,190 |
1645 Imagine you have a wreck |
1644 Imagine you have a regular |
1646 expression R1, okay? |
1645 expression r1, okay? |
1647 |
1646 |
1648 359 |
1647 359 |
1649 00:18:10,190 --> 00:18:13,220 |
1648 00:18:10,190 --> 00:18:13,220 |
1650 And you have a string abc, |
1649 And you have a string abc, |
1651 |
1650 |
1653 00:18:13,220 --> 00:18:14,270 |
1652 00:18:13,220 --> 00:18:14,270 |
1654 and you want to find out |
1653 and you want to find out |
1655 |
1654 |
1656 361 |
1655 361 |
1657 00:18:14,270 --> 00:18:17,180 |
1656 00:18:14,270 --> 00:18:17,180 |
1658 whether one can |
1657 whether r1 can |
1659 match that string. |
1658 match that string. |
1660 |
1659 |
1661 362 |
1660 362 |
1662 00:18:17,180 --> 00:18:18,799 |
1661 00:18:17,180 --> 00:18:18,799 |
1663 And for the moment, |
1662 And for the moment, |
1671 00:18:22,610 --> 00:18:26,315 |
1670 00:18:22,610 --> 00:18:26,315 |
1672 Ok? So the language L of |
1671 Ok? So the language L of |
1673 |
1672 |
1674 365 |
1673 365 |
1675 00:18:26,315 --> 00:18:30,185 |
1674 00:18:26,315 --> 00:18:30,185 |
1676 R will actually |
1675 r1 will actually |
1677 contain that string, |
1676 contain that string, |
1678 |
1677 |
1679 366 |
1678 366 |
1680 00:18:30,185 --> 00:18:31,805 |
1679 00:18:30,185 --> 00:18:31,805 |
1681 otherwise it wouldn't match that. |
1680 otherwise it wouldn't match that. |
1682 |
1681 |
1683 367 |
1682 367 |
1684 00:18:31,805 --> 00:18:36,710 |
1683 00:18:31,805 --> 00:18:36,710 |
1685 Okay? So ABC is in |
1684 Okay? So abc is in |
1686 this language, okay? |
1685 this language, okay? |
1687 |
1686 |
1688 368 |
1687 368 |
1689 00:18:36,710 --> 00:18:39,785 |
1688 00:18:36,710 --> 00:18:39,785 |
1690 If I now take the |
1689 If I now take the |
1691 semantic derivative, |
1690 semantic derivative, |
1692 |
1691 |
1693 369 |
1692 369 |
1694 00:18:39,785 --> 00:18:43,145 |
1693 00:18:39,785 --> 00:18:43,145 |
1695 that means I look at all |
1694 that means I look at all |
1696 the strings in this f, |
1695 the strings in this L(r1) |
1697 |
1696 |
1698 370 |
1697 370 |
1699 00:18:43,145 --> 00:18:46,820 |
1698 00:18:43,145 --> 00:18:46,820 |
1700 R1, and further out |
1699 and filter out |
1701 |
1700 |
1702 371 |
1701 371 |
1703 00:18:46,820 --> 00:18:48,740 |
1702 00:18:46,820 --> 00:18:48,740 |
1704 all the ones which |
1703 all the ones which |
1705 do not start with |
1704 do not start with |
1708 00:18:48,740 --> 00:18:51,260 |
1707 00:18:48,740 --> 00:18:51,260 |
1709 an a, I discharge them. |
1708 an a, I discharge them. |
1710 |
1709 |
1711 373 |
1710 373 |
1712 00:18:51,260 --> 00:18:54,545 |
1711 00:18:51,260 --> 00:18:54,545 |
1713 And I only look the one |
1712 And I only look at the ones |
1714 which start with an a. |
1713 which start with an a. |
1715 |
1714 |
1716 374 |
1715 374 |
1717 00:18:54,545 --> 00:18:56,465 |
1716 00:18:54,545 --> 00:18:56,465 |
1718 And of those strings, |
1717 And of those strings, |
1722 I chop off this a. |
1721 I chop off this a. |
1723 |
1722 |
1724 376 |
1723 376 |
1725 00:18:58,475 --> 00:19:01,025 |
1724 00:18:58,475 --> 00:19:01,025 |
1726 So after this |
1725 So after this |
1727 romantic derivative, |
1726 semantic derivative, |
1728 |
1727 |
1729 377 |
1728 377 |
1730 00:19:01,025 --> 00:19:05,735 |
1729 00:19:01,025 --> 00:19:05,735 |
1731 this set of strings will |
1730 this set of strings will |
1732 contain just B and C. |
1731 contain just bc. |
1733 |
1732 |
1734 378 |
1733 378 |
1735 00:19:05,735 --> 00:19:12,830 |
1734 00:19:05,735 --> 00:19:12,830 |
1736 Ok. Now if I build the next |
1735 Ok. Now if I build the next |
1737 semantic derivative of that, |
1736 semantic derivative of that, |
1741 then I would look at |
1740 then I would look at |
1742 |
1741 |
1743 380 |
1742 380 |
1744 00:19:14,345 --> 00:19:16,850 |
1743 00:19:14,345 --> 00:19:16,850 |
1745 all the strings which |
1744 all the strings which |
1746 start with a P, |
1745 start with a b, |
1747 |
1746 |
1748 381 |
1747 381 |
1749 00:19:16,850 --> 00:19:21,350 |
1748 00:19:16,850 --> 00:19:21,350 |
1750 and forget about everything |
1749 and forget about everything |
1751 else of the ones. |
1750 else. Of the remaining ones |
1752 |
1751 |
1753 382 |
1752 382 |
1754 00:19:21,350 --> 00:19:27,905 |
1753 00:19:21,350 --> 00:19:27,905 |
1755 I know they start with B. |
1754 I know they start with b. |
1756 I just chop of the B. Ok. |
1755 I just chop of the b. Ok. |
1757 |
1756 |
1758 383 |
1757 383 |
1759 00:19:27,905 --> 00:19:31,655 |
1758 00:19:27,905 --> 00:19:31,655 |
1760 So in this whole set here, |
1759 So in this whole set here, |
1761 |
1760 |
1774 semantic derivative |
1773 semantic derivative |
1775 |
1774 |
1776 387 |
1775 387 |
1777 00:19:44,420 --> 00:19:47,300 |
1776 00:19:44,420 --> 00:19:47,300 |
1778 because I want to find out |
1777 because I want to find out |
1779 whether ABC is involved. |
1778 whether abc is matched. |
1780 |
1779 |
1781 388 |
1780 388 |
1782 00:19:47,300 --> 00:19:50,540 |
1781 00:19:47,300 --> 00:19:50,540 |
1783 Okay? So now I look |
1782 Okay? So now I look |
1784 at all the strings in |
1783 at all the strings in |
1788 here and look at |
1787 here and look at |
1789 |
1788 |
1790 390 |
1789 390 |
1791 00:19:52,820 --> 00:19:55,340 |
1790 00:19:52,820 --> 00:19:55,340 |
1792 them whether they start |
1791 them whether they start |
1793 with a C. If yes, |
1792 with a c. If yes, |
1794 |
1793 |
1795 391 |
1794 391 |
1796 00:19:55,340 --> 00:19:56,885 |
1795 00:19:55,340 --> 00:19:56,885 |
1797 I chop off the sea. |
1796 I chop off the c. |
1798 |
1797 |
1799 392 |
1798 392 |
1800 00:19:56,885 --> 00:19:59,120 |
1799 00:19:56,885 --> 00:19:59,120 |
1801 And put in markets remaining. |
1800 And put in what is remaining. |
1802 |
1801 |
1803 393 |
1802 393 |
1804 00:19:59,120 --> 00:20:00,425 |
1803 00:19:59,120 --> 00:20:00,425 |
1805 So in this case, |
1804 So in this case, |
1806 |
1805 |
1809 if I have the string c |
1808 if I have the string c |
1810 |
1809 |
1811 395 |
1810 395 |
1812 00:20:02,510 --> 00:20:04,550 |
1811 00:20:02,510 --> 00:20:04,550 |
1813 in this language and |
1812 in this language and |
1814 I chop off this, |
1813 I chop off this c, |
1815 |
1814 |
1816 396 |
1815 396 |
1817 00:20:04,550 --> 00:20:07,700 |
1816 00:20:04,550 --> 00:20:07,700 |
1818 see what is remaining |
1817 what is remaining |
1819 is the empty string. |
1818 is the empty string. |
1820 |
1819 |
1821 397 |
1820 397 |
1822 00:20:07,700 --> 00:20:09,695 |
1821 00:20:07,700 --> 00:20:09,695 |
1823 So we have to check of |
1822 So we have to check of |
1828 contains the empty string. |
1827 contains the empty string. |
1829 |
1828 |
1830 399 |
1829 399 |
1831 00:20:14,510 --> 00:20:18,800 |
1830 00:20:14,510 --> 00:20:18,800 |
1832 If yes, then the |
1831 If yes, then the |
1833 original R1 can match |
1832 original r1 can match |
1834 |
1833 |
1835 400 |
1834 400 |
1836 00:20:18,800 --> 00:20:21,050 |
1835 00:20:18,800 --> 00:20:21,050 |
1837 this ABC because this ABC |
1836 this abc because this abc |
1838 |
1837 |
1839 401 |
1838 401 |
1840 00:20:21,050 --> 00:20:24,119 |
1839 00:20:21,050 --> 00:20:24,119 |
1841 must have been in this language. |
1840 must have been in this language. |
1842 |
1841 |
1843 402 |
1842 402 |
1844 00:20:24,130 --> 00:20:28,565 |
1843 00:20:24,130 --> 00:20:28,565 |
1845 And if in the end there wasn't |
1844 And if in the end there wasn't |
1846 the empty string, then, |
1845 the empty string, then |
1847 |
1846 |
1848 403 |
1847 403 |
1849 00:20:28,565 --> 00:20:33,575 |
1848 00:20:28,565 --> 00:20:33,575 |
1850 then this ABC Watson in |
1849 this abs was not in |
1851 this language of one. |
1850 this language of r1. |
1852 |
1851 |
1853 404 |
1852 404 |
1854 00:20:33,575 --> 00:20:36,260 |
1853 00:20:33,575 --> 00:20:36,260 |
1855 And so the electron must have, |
1854 And so the lexer must have, |
1856 |
1855 |
1857 405 |
1856 405 |
1858 00:20:36,260 --> 00:20:38,880 |
1857 00:20:36,260 --> 00:20:38,880 |
1859 or the metro must have failed. |
1858 or the matcher must have failed. |
1860 |
1859 |
1861 406 |
1860 406 |
1862 00:20:39,040 --> 00:20:42,530 |
1861 00:20:39,040 --> 00:20:42,530 |
1863 The clever bit is that here |
1862 The clever bit is that here |
1864 |
1863 |
1881 So that's not really |
1880 So that's not really |
1882 an algorithm. |
1881 an algorithm. |
1883 |
1882 |
1884 411 |
1883 411 |
1885 00:20:55,730 --> 00:20:58,880 |
1884 00:20:55,730 --> 00:20:58,880 |
1886 Yeah, that's just |
1885 That's just |
1887 explaining the idea with |
1886 explaining the idea. |
1888 |
1887 |
1889 412 |
1888 412 |
1890 00:20:58,880 --> 00:21:02,525 |
1889 00:20:58,880 --> 00:21:02,525 |
1891 preserves key |
1890 What Brzozowski |
1892 achieved was that he |
1891 achieved was that he |
1893 |
1892 |
1894 413 |
1893 413 |
1895 00:21:02,525 --> 00:21:06,440 |
1894 00:21:02,525 --> 00:21:06,440 |
1896 now works with this derivative |
1895 now works with this derivative |
1897 America expressions and |
1896 regular expressions and |
1898 |
1897 |
1899 414 |
1898 414 |
1900 00:21:06,440 --> 00:21:10,715 |
1899 00:21:06,440 --> 00:21:10,715 |
1901 somehow imitates what |
1900 somehow imitates what |
1902 happens on these languages. |
1901 happens on these languages. |
1903 |
1902 |
1904 415 |
1903 415 |
1905 00:21:10,715 --> 00:21:14,135 |
1904 00:21:10,715 --> 00:21:14,135 |
1906 Because remember if you |
1905 Because remember if you |
1907 have an wreck expression, |
1906 have a regular expression, |
1908 |
1907 |
1909 416 |
1908 416 |
1910 00:21:14,135 --> 00:21:17,405 |
1909 00:21:14,135 --> 00:21:17,405 |
1911 are you want to test |
1910 and you want to test |
1912 whether can match APC, |
1911 whether it can match abc, |
1913 |
1912 |
1914 417 |
1913 417 |
1915 00:21:17,405 --> 00:21:22,550 |
1914 00:21:17,405 --> 00:21:22,550 |
1916 then you take first |
1915 then you take first |
1917 derivative according to a. |
1916 derivative according to a. |
1918 |
1917 |
1919 418 |
1918 418 |
1920 00:21:22,550 --> 00:21:25,760 |
1919 00:21:22,550 --> 00:21:25,760 |
1921 So you will get a wreck |
1920 So you will get a regular |
1922 expression which can match b |
1921 expression which can match b |
1923 |
1922 |
1924 419 |
1923 419 |
1925 00:21:25,760 --> 00:21:29,464 |
1924 00:21:25,760 --> 00:21:29,464 |
1926 and c If R could match abc. |
1925 and c, if r could match abc. |
1927 |
1926 |
1928 420 |
1927 420 |
1929 00:21:29,464 --> 00:21:31,430 |
1928 00:21:29,464 --> 00:21:31,430 |
1930 So after the first derivative, |
1929 So after the first derivative, |
1931 |
1930 |
1932 421 |
1931 421 |
1933 00:21:31,430 --> 00:21:33,620 |
1932 00:21:31,430 --> 00:21:33,620 |
1934 you will get a wreck expression |
1933 you will get a regular expression |
1935 which can match B and |
1934 which can match b and |
1936 |
1935 |
1937 422 |
1936 422 |
1938 00:21:33,620 --> 00:21:37,070 |
1937 00:21:33,620 --> 00:21:37,070 |
1939 C. If you take the |
1938 c. If you take the |
1940 second derivative, |
1939 second derivative, |
1941 |
1940 |
1942 423 |
1941 423 |
1943 00:21:37,070 --> 00:21:41,225 |
1942 00:21:37,070 --> 00:21:41,225 |
1944 you will get a reexpression |
1943 you will get a regular expression |
1945 which can match c alone. |
1944 which can match c alone. |
1946 |
1945 |
1947 424 |
1946 424 |
1948 00:21:41,225 --> 00:21:44,180 |
1947 00:21:41,225 --> 00:21:44,180 |
1949 And if you take the |
1948 And if you take the |
1953 00:21:44,180 --> 00:21:46,070 |
1952 00:21:44,180 --> 00:21:46,070 |
1954 then you will get |
1953 then you will get |
1955 |
1954 |
1956 426 |
1955 426 |
1957 00:21:46,070 --> 00:21:48,260 |
1956 00:21:46,070 --> 00:21:48,260 |
1958 rec expression which hopefully |
1957 a regular expression which hopefully |
1959 |
1958 |
1960 427 |
1959 427 |
1961 00:21:48,260 --> 00:21:49,715 |
1960 00:21:48,260 --> 00:21:49,715 |
1962 can match the empty string. |
1961 can match the empty string. |
1963 |
1962 |
1964 428 |
1963 428 |
1965 00:21:49,715 --> 00:21:53,780 |
1964 00:21:49,715 --> 00:21:53,780 |
1966 If it does, then this |
1965 If it does, then this |
1967 R can match the ABC. |
1966 r can match the abc. |
1968 |
1967 |
1969 429 |
1968 429 |
1970 00:21:53,780 --> 00:21:55,655 |
1969 00:21:53,780 --> 00:21:55,655 |
1971 And if it doesn't, then |
1970 And if it doesn't, then |
1972 |
1971 |
1973 430 |
1972 430 |
1974 00:21:55,655 --> 00:21:58,680 |
1973 00:21:55,655 --> 00:21:58,680 |
1975 ABC couldn't be |
1974 abc couldn't be |
1976 matched by this on. |
1975 matched by this r. |
1977 |
1976 |
1978 431 |
1977 431 |
1979 00:21:58,900 --> 00:22:02,990 |
1978 00:21:58,900 --> 00:22:02,990 |
1980 Okay, let's have a look |
1979 Okay, let's have a look |
1981 how this pans out in code. |
1980 how this pans out in code. |
1982 |
1981 |
1983 432 |
1982 432 |
1984 00:22:02,990 --> 00:22:06,050 |
1983 00:22:02,990 --> 00:22:06,050 |
1985 Here's defile RE1. |
1984 Here's the file re1.sc. |
1986 |
1985 |
1987 433 |
1986 433 |
1988 00:22:06,050 --> 00:22:07,940 |
1987 00:22:06,050 --> 00:22:07,940 |
1989 It's also uploaded on Keith, |
1988 It's also uploaded on KEATS, |
1990 |
1989 |
1991 434 |
1990 434 |
1992 00:22:07,940 --> 00:22:10,625 |
1991 00:22:07,940 --> 00:22:10,625 |
1993 so you can see exactly |
1992 so you can see exactly |
1994 what I'm doing. |
1993 what I'm doing. |
1995 |
1994 |
1996 435 |
1995 435 |
1997 00:22:10,625 --> 00:22:13,970 |
1996 00:22:10,625 --> 00:22:13,970 |
1998 And actually I already saw |
1997 And actually you already saw |
1999 that file because I showed you |
1998 that file because I showed you |
2000 |
1999 |
2001 436 |
2000 436 |
2002 00:22:13,970 --> 00:22:15,710 |
2001 00:22:13,970 --> 00:22:15,710 |
2003 how my wreck expressions are |
2002 how my regular expressions are |
2004 |
2003 |
2005 437 |
2004 437 |
2006 00:22:15,710 --> 00:22:17,960 |
2005 00:22:15,710 --> 00:22:17,960 |
2007 defined with the |
2006 defined, with the |
2008 abstract classes here. |
2007 abstract classes here. |
2009 |
2008 |
2010 438 |
2009 438 |
2011 00:22:17,960 --> 00:22:21,155 |
2010 00:22:17,960 --> 00:22:21,155 |
2012 And here, the six cases |
2011 And here, the six cases |
2013 for 0-1 character, |
2012 for 0, 1, character, |
2014 |
2013 |
2015 439 |
2014 439 |
2016 00:22:21,155 --> 00:22:23,540 |
2015 00:22:21,155 --> 00:22:23,540 |
2017 I turn a TIF in |
2016 alternative, |
2018 sequence and star. |
2017 sequence and star. |
2019 |
2018 |
2020 440 |
2019 440 |
2021 00:22:23,540 --> 00:22:26,705 |
2020 00:22:23,540 --> 00:22:26,705 |
2022 Ok. So the first |
2021 Ok. So the first |
2041 we just go through |
2040 we just go through |
2042 all these six cases |
2041 all these six cases |
2043 |
2042 |
2044 445 |
2043 445 |
2045 00:22:37,040 --> 00:22:38,900 |
2044 00:22:37,040 --> 00:22:38,900 |
2046 are serious defined as false. |
2045 0 is defined as false. |
2047 |
2046 |
2048 446 |
2047 446 |
2049 00:22:38,900 --> 00:22:43,234 |
2048 00:22:38,900 --> 00:22:43,234 |
2050 One is defined as true |
2049 One is defined as true. |
2051 character for any character, |
2050 Character, for any character, |
2052 |
2051 |
2053 447 |
2052 447 |
2054 00:22:43,234 --> 00:22:45,455 |
2053 00:22:43,234 --> 00:22:45,455 |
2055 this null return false. |
2054 this nullable will return false. |
2056 |
2055 |
2057 448 |
2056 448 |
2058 00:22:45,455 --> 00:22:47,540 |
2057 00:22:45,455 --> 00:22:47,540 |
2059 The alternative is to find here, |
2058 The alternative is defined here, |
2060 |
2059 |
2061 449 |
2060 449 |
2062 00:22:47,540 --> 00:22:50,000 |
2061 00:22:47,540 --> 00:22:50,000 |
2063 so that's the or in Scala. |
2062 so that's the or in Scala. |
2064 |
2063 |
2065 450 |
2064 450 |
2066 00:22:50,000 --> 00:22:52,700 |
2065 00:22:50,000 --> 00:22:52,700 |
2067 And for the sequence, |
2066 And for the sequence, |
2068 that's the end. |
2067 that's the and. |
2069 |
2068 |
2070 451 |
2069 451 |
2071 00:22:52,700 --> 00:22:56,180 |
2070 00:22:52,700 --> 00:22:56,180 |
2072 And this star, no matter |
2071 And this star, no matter |
2073 what the reg expression is, |
2072 what the regular expression is, |
2074 |
2073 |
2075 452 |
2074 452 |
2076 00:22:56,180 --> 00:22:59,540 |
2075 00:22:56,180 --> 00:22:59,540 |
2077 it will always match the |
2076 it will always match the |
2078 empty string, so true. |
2077 empty string, so true. |
2079 |
2078 |
2080 453 |
2079 453 |
2081 00:22:59,540 --> 00:23:02,225 |
2080 00:22:59,540 --> 00:23:02,225 |
2082 So nanobots, very easy. |
2081 So nullable is very easy. |
2083 |
2082 |
2084 454 |
2083 454 |
2085 00:23:02,225 --> 00:23:07,430 |
2084 00:23:02,225 --> 00:23:07,430 |
2086 The derivative is also not |
2085 The derivative is also not |
2087 so much more complicated. |
2086 so much more complicated. |
2091 It takes two arguments, |
2090 It takes two arguments, |
2092 |
2091 |
2093 456 |
2092 456 |
2094 00:23:08,974 --> 00:23:11,810 |
2093 00:23:08,974 --> 00:23:11,810 |
2095 a character and the |
2094 a character and the |
2096 rec expression, |
2095 regular expression, |
2097 |
2096 |
2098 457 |
2097 457 |
2099 00:23:11,810 --> 00:23:14,405 |
2098 00:23:11,810 --> 00:23:14,405 |
2100 and returns a wreck expression. |
2099 and returns a regular expression. |
2101 |
2100 |
2102 458 |
2101 458 |
2103 00:23:14,405 --> 00:23:16,340 |
2102 00:23:14,405 --> 00:23:16,340 |
2104 So now we just have to analyze |
2103 So now we just have to analyze |
2105 |
2104 |
2106 459 |
2105 459 |
2107 00:23:16,340 --> 00:23:18,890 |
2106 00:23:16,340 --> 00:23:18,890 |
2108 what's that reg |
2107 what's that regular |
2109 expression looks like. |
2108 expression looks like. |
2110 |
2109 |
2111 460 |
2110 460 |
2112 00:23:18,890 --> 00:23:23,870 |
2111 00:23:18,890 --> 00:23:23,870 |
2113 If it's 0, we return |
2112 If it's 0, we return |
2114 01, we return 0. |
2113 0; if it's 1 we return 0. |
2115 |
2114 |
2116 461 |
2115 461 |
2117 00:23:23,870 --> 00:23:26,990 |
2116 00:23:23,870 --> 00:23:26,990 |
2118 If the character is the |
2117 If the character... If the |
2119 wreck expressions character |
2118 regular expressions character |
2120 |
2119 |
2121 462 |
2120 462 |
2122 00:23:26,990 --> 00:23:30,260 |
2121 00:23:26,990 --> 00:23:30,260 |
2123 d. We test whether it's |
2122 d, we test whether it's |
2124 equal to this character. |
2123 equal to this character |
2125 |
2124 |
2126 463 |
2125 463 |
2127 00:23:30,260 --> 00:23:32,225 |
2126 00:23:30,260 --> 00:23:32,225 |
2128 We want to take the |
2127 we want to take the |
2129 derivative off. |
2128 derivative off. |
2130 |
2129 |
2131 464 |
2130 464 |
2132 00:23:32,225 --> 00:23:36,185 |
2131 00:23:32,225 --> 00:23:36,185 |
2133 If yes, return one, otherwise 0. |
2132 If yes, return 1, otherwise 0. |
2134 |
2133 |
2135 465 |
2134 465 |
2136 00:23:36,185 --> 00:23:38,600 |
2135 00:23:36,185 --> 00:23:38,600 |
2137 The alternative which has pushed |
2136 The alternative which has pushed |
2138 |
2137 |
2159 If it's not the have to push |
2158 If it's not the have to push |
2160 |
2159 |
2161 471 |
2160 471 |
2162 00:23:49,040 --> 00:23:52,160 |
2161 00:23:49,040 --> 00:23:52,160 |
2163 the derivative over |
2162 the derivative over |
2164 the first DR1. |
2163 the first, the r1. |
2165 |
2164 |
2166 472 |
2165 472 |
2167 00:23:52,160 --> 00:23:56,135 |
2166 00:23:52,160 --> 00:23:56,135 |
2168 And otherwise if it is null |
2167 And otherwise if it is nullable, |
2169 above we have two cases. |
2168 we have two cases. |
2170 |
2169 |
2171 473 |
2170 473 |
2172 00:23:56,135 --> 00:24:00,605 |
2171 00:23:56,135 --> 00:24:00,605 |
2173 Either you have to push |
2172 Either you have to push |
2174 it over the R1 or R2. |
2173 it over the r1 or r2. |
2175 |
2174 |
2176 474 |
2175 474 |
2177 00:24:00,605 --> 00:24:03,860 |
2176 00:24:00,605 --> 00:24:03,860 |
2178 And a star case, this one. |
2177 And a star case, this one. |
2179 |
2178 |
2182 We just build the sequence |
2181 We just build the sequence |
2183 of the derivative of |
2182 of the derivative of |
2184 |
2183 |
2185 476 |
2184 476 |
2186 00:24:07,160 --> 00:24:11,600 |
2185 00:24:07,160 --> 00:24:11,600 |
2187 R1 and append the |
2186 r1 and append the |
2188 or as a sequence, |
2187 r1 as a sequence, |
2189 |
2188 |
2190 477 |
2189 477 |
2191 00:24:11,600 --> 00:24:14,195 |
2190 00:24:11,600 --> 00:24:14,195 |
2192 put the star at the end. |
2191 put the star at the end. |
2193 |
2192 |
2194 478 |
2193 478 |
2195 00:24:14,195 --> 00:24:19,430 |
2194 00:24:14,195 --> 00:24:19,430 |
2196 Okay, so here's this |
2195 Okay, so here's the |
2197 function for strings. |
2196 function for strings. |
2198 |
2197 |
2199 479 |
2198 479 |
2200 00:24:19,430 --> 00:24:21,740 |
2199 00:24:19,430 --> 00:24:21,740 |
2201 So a list of characters. |
2200 So a list of characters. |
2202 |
2201 |
2203 480 |
2202 480 |
2204 00:24:21,740 --> 00:24:23,870 |
2203 00:24:21,740 --> 00:24:23,870 |
2205 This function takes N, |
2204 This function takes an s, |
2206 |
2205 |
2207 481 |
2206 481 |
2208 00:24:23,870 --> 00:24:26,480 |
2207 00:24:23,870 --> 00:24:26,480 |
2209 S list of characters argument |
2208 a list of characters as argument |
2210 and a reg expression |
2209 and a regular expression |
2211 |
2210 |
2212 482 |
2211 482 |
2213 00:24:26,480 --> 00:24:29,360 |
2212 00:24:26,480 --> 00:24:29,360 |
2214 as argument and returns |
2213 as argument and returns |
2215 a wreck expression. |
2214 a regular expression. |
2216 |
2215 |
2217 483 |
2216 483 |
2218 00:24:29,360 --> 00:24:31,460 |
2217 00:24:29,360 --> 00:24:31,460 |
2219 And we just have to |
2218 And we just have to |
2220 pattern match what |
2219 pattern match what |
2229 If it's the empty list, |
2228 If it's the empty list, |
2230 |
2229 |
2231 486 |
2230 486 |
2232 00:24:35,360 --> 00:24:38,030 |
2231 00:24:35,360 --> 00:24:38,030 |
2233 it just immediately |
2232 it just immediately |
2234 return the R. If |
2233 return the r. If |
2235 |
2234 |
2236 487 |
2235 487 |
2237 00:24:38,030 --> 00:24:42,020 |
2236 00:24:38,030 --> 00:24:42,020 |
2238 it's something of the |
2237 it's something of the |
2239 form C followed by S, |
2238 form c followed by s, |
2240 |
2239 |
2241 488 |
2240 488 |
2242 00:24:42,020 --> 00:24:45,050 |
2241 00:24:42,020 --> 00:24:45,050 |
2243 then we build first the |
2242 then we build first the |
2244 derivative according |
2243 derivative according |
2245 |
2244 |
2246 489 |
2245 489 |
2247 00:24:45,050 --> 00:24:48,345 |
2246 00:24:45,050 --> 00:24:48,345 |
2248 to a C. And then |
2247 to c. And then |
2249 the result of that, |
2248 the result of that, |
2250 |
2249 |
2251 490 |
2250 490 |
2252 00:24:48,345 --> 00:24:50,090 |
2251 00:24:48,345 --> 00:24:50,090 |
2253 people recursively call |
2252 we recursively call |
2254 |
2253 |
2255 491 |
2254 491 |
2256 00:24:50,090 --> 00:24:53,675 |
2255 00:24:50,090 --> 00:24:53,675 |
2257 the derivative |
2256 the derivative |
2258 according to s. Okay? |
2257 according to s. Okay? |
2259 |
2258 |
2260 492 |
2259 492 |
2261 00:24:53,675 --> 00:24:56,060 |
2260 00:24:53,675 --> 00:24:56,060 |
2262 And now the main mantra, |
2261 And now the main matcher, |
2263 |
2262 |
2264 493 |
2263 493 |
2265 00:24:56,060 --> 00:24:59,360 |
2264 00:24:56,060 --> 00:24:59,360 |
2266 it takes a reg |
2265 it takes a regular |
2267 expression and takes |
2266 expression and takes |
2268 |
2267 |
2269 494 |
2268 494 |
2270 00:24:59,360 --> 00:25:02,870 |
2269 00:24:59,360 --> 00:25:02,870 |
2271 a string and returns a |
2270 a string and returns a |
2281 The only thing I have to do here |
2280 The only thing I have to do here |
2282 |
2281 |
2283 497 |
2282 497 |
2284 00:25:07,430 --> 00:25:09,410 |
2283 00:25:07,430 --> 00:25:09,410 |
2285 because I'm implementing |
2284 because I'm implementing |
2286 it and scholar, |
2285 it in Scala, |
2287 |
2286 |
2288 498 |
2287 498 |
2289 00:25:09,410 --> 00:25:15,064 |
2288 00:25:09,410 --> 00:25:15,064 |
2290 I have to coerce between strings |
2289 I have to coerce between strings |
2291 and lists of characters. |
2290 and lists of characters. |
2294 00:25:15,064 --> 00:25:16,580 |
2293 00:25:15,064 --> 00:25:16,580 |
2295 So he, I take first |
2294 So he, I take first |
2296 |
2295 |
2297 500 |
2296 500 |
2298 00:25:16,580 --> 00:25:20,810 |
2297 00:25:16,580 --> 00:25:20,810 |
2299 the two list of the S that |
2298 the toList of the s. That |
2300 gives me a list of characters. |
2299 gives me a list of characters. |
2301 |
2300 |
2302 501 |
2301 501 |
2303 00:25:20,810 --> 00:25:23,135 |
2302 00:25:20,810 --> 00:25:23,135 |
2304 Then I build this derivative |
2303 Then I build this derivative |
2305 |
2304 |
2306 502 |
2305 502 |
2307 00:25:23,135 --> 00:25:26,045 |
2306 00:25:23,135 --> 00:25:26,045 |
2308 is ds because I'm |
2307 with the s, because I'm |
2309 looking at strings. |
2308 looking at strings. |
2310 |
2309 |
2311 503 |
2310 503 |
2312 00:25:26,045 --> 00:25:28,160 |
2311 00:25:26,045 --> 00:25:28,160 |
2313 And then at the end, |
2312 And then at the end, |
2314 |
2313 |
2315 504 |
2314 504 |
2316 00:25:28,160 --> 00:25:33,050 |
2315 00:25:28,160 --> 00:25:33,050 |
2317 built-in nullable of |
2316 built the nullable of |
2318 the final derivative. |
2317 the final derivative. |
2319 |
2318 |
2320 505 |
2319 505 |
2321 00:25:33,050 --> 00:25:37,310 |
2320 00:25:33,050 --> 00:25:37,310 |
2322 The beauty of all this |
2321 The beauty of all this |
2363 what's the derivative |
2362 what's the derivative |
2364 according to a, |
2363 according to a, |
2365 |
2364 |
2366 515 |
2365 515 |
2367 00:26:00,305 --> 00:26:02,720 |
2366 00:26:00,305 --> 00:26:02,720 |
2368 B, and C of this |
2367 b, and c of this |
2369 record expression. |
2368 regular expression. |
2370 |
2369 |
2371 516 |
2370 516 |
2372 00:26:02,720 --> 00:26:07,040 |
2371 00:26:02,720 --> 00:26:07,040 |
2373 And I hope you didn't |
2372 And I hope you didn't |
2374 make any mistake. |
2373 make any mistake. |
2375 |
2374 |
2376 517 |
2375 517 |
2377 00:26:07,040 --> 00:26:09,260 |
2376 00:26:07,040 --> 00:26:09,260 |
2378 Now of course we want |
2377 Now of course we want |
2379 to see whether B |
2378 to see whether we |
2380 |
2379 |
2381 518 |
2380 518 |
2382 00:26:09,260 --> 00:26:11,390 |
2381 00:26:09,260 --> 00:26:11,390 |
2383 do any better with |
2382 do any better with |
2384 this algorithm. |
2383 this algorithm... |
2385 |
2384 |
2386 519 |
2385 519 |
2387 00:26:11,390 --> 00:26:13,715 |
2386 00:26:11,390 --> 00:26:13,715 |
2388 Then Python and Ruby. |
2387 than Python and Ruby. |
2389 |
2388 |
2390 520 |
2389 520 |
2391 00:26:13,715 --> 00:26:16,070 |
2390 00:26:13,715 --> 00:26:16,070 |
2392 And let's first have a |
2391 And let's first have a |
2393 look at this example. |
2392 look at this example. |
2394 |
2393 |
2395 521 |
2394 521 |
2396 00:26:16,070 --> 00:26:18,079 |
2395 00:26:16,070 --> 00:26:18,079 |
2397 So this reg expression. |
2396 So this regular expression. |
2398 |
2397 |
2399 522 |
2398 522 |
2400 00:26:18,079 --> 00:26:19,880 |
2399 00:26:18,079 --> 00:26:19,880 |
2401 The problem is that in |
2400 The problem is that in |
2402 |
2401 |
2403 523 |
2402 523 |
2404 00:26:19,880 --> 00:26:22,070 |
2403 00:26:19,880 --> 00:26:22,070 |
2405 our reg expression |
2404 our regular expression |
2406 metro so far we have |
2405 matcher so far we have |
2407 |
2406 |
2408 524 |
2407 524 |
2409 00:26:22,070 --> 00:26:24,530 |
2408 00:26:22,070 --> 00:26:24,530 |
2410 the sequence right |
2409 the sequence rregular |
2411 expression where we |
2410 expression, but we |
2412 |
2411 |
2413 525 |
2412 525 |
2414 00:26:24,530 --> 00:26:27,200 |
2413 00:26:24,530 --> 00:26:27,200 |
2415 don't have this optional |
2414 don't have this optional |
2416 wreck expression. |
2415 regular expression. |
2417 |
2416 |
2418 526 |
2417 526 |
2419 00:26:27,200 --> 00:26:30,800 |
2418 00:26:27,200 --> 00:26:30,800 |
2420 And we don't have this n |
2419 And we don't have this n |
2421 times Rekha expression, |
2420 times regular expression, |
2422 |
2421 |
2423 527 |
2422 527 |
2424 00:26:30,800 --> 00:26:36,605 |
2423 00:26:30,800 --> 00:26:36,605 |
2425 but we can quickly implement |
2424 but we can quickly implement |
2426 that in our implementation. |
2425 that in our implementation. |
2427 |
2426 |
2428 528 |
2427 528 |
2429 00:26:36,605 --> 00:26:40,549 |
2428 00:26:36,605 --> 00:26:40,549 |
2430 So if you want to build the |
2429 So if you want to build the |
2431 optional reg expression, |
2430 optional regular expression, |
2432 |
2431 |
2433 529 |
2432 529 |
2434 00:26:40,549 --> 00:26:41,870 |
2433 00:26:40,549 --> 00:26:41,870 |
2435 which is defined here, |
2434 which is defined here, |
2436 |
2435 |
2439 a little function which |
2438 a little function which |
2440 takes a reg expression and |
2439 takes a reg expression and |
2441 |
2440 |
2442 531 |
2441 531 |
2443 00:26:44,570 --> 00:26:47,360 |
2442 00:26:44,570 --> 00:26:47,360 |
2444 returns the alternative of R one. |
2443 returns the alternative of r and 1. |
2445 |
2444 |
2446 532 |
2445 532 |
2447 00:26:47,360 --> 00:26:49,624 |
2446 00:26:47,360 --> 00:26:49,624 |
2448 So it essentially |
2447 So it essentially |
2449 takes the definition |
2448 takes the definition |
2450 |
2449 |
2451 533 |
2450 533 |
2452 00:26:49,624 --> 00:26:53,240 |
2451 00:26:49,624 --> 00:26:53,240 |
2453 of optional R and |
2452 of optional r and |
2454 replaces it by that. |
2453 replaces it by that. |
2455 |
2454 |
2456 534 |
2455 534 |
2457 00:26:53,240 --> 00:26:56,150 |
2456 00:26:53,240 --> 00:26:56,150 |
2458 The end times what we |
2457 The n-times what we |
2459 essentially do it, |
2458 essentially do there is |
2460 |
2459 |
2461 535 |
2460 535 |
2462 00:26:56,150 --> 00:27:01,535 |
2461 00:26:56,150 --> 00:27:01,535 |
2463 There's beaches built n |
2462 we built n copies of this r. Ok? |
2464 copies of this r. Ok? |
|
2465 |
2463 |
2466 536 |
2464 536 |
2467 00:27:01,535 --> 00:27:04,745 |
2465 00:27:01,535 --> 00:27:04,745 |
2468 So if this n times was 0, |
2466 So if this n-times was 0, |
2469 |
2467 |
2470 537 |
2468 537 |
2471 00:27:04,745 --> 00:27:06,245 |
2469 00:27:04,745 --> 00:27:06,245 |
2472 we just return one. |
2470 we just return one. |
2473 |
2471 |
2474 538 |
2472 538 |
2475 00:27:06,245 --> 00:27:11,570 |
2473 00:27:06,245 --> 00:27:11,570 |
2476 If it's one, then we return |
2474 If it's one, then we return |
2477 just the reg expression. |
2475 just the regular expression. |
2478 |
2476 |
2479 539 |
2477 539 |
2480 00:27:11,570 --> 00:27:15,575 |
2478 00:27:11,570 --> 00:27:15,575 |
2481 And if it's a, something |
2479 And if it's something |
2482 bigger than one, |
2480 bigger than one, |
2483 |
2481 |
2484 540 |
2482 540 |
2485 00:27:15,575 --> 00:27:18,560 |
2483 00:27:15,575 --> 00:27:18,560 |
2486 then we just built the |
2484 then we just built the |
2491 these copies and call |
2489 these copies and call |
2492 |
2490 |
2493 542 |
2491 542 |
2494 00:27:20,270 --> 00:27:22,925 |
2492 00:27:20,270 --> 00:27:22,925 |
2495 this function again |
2493 this function again |
2496 with n minus one. |
2494 with n - 1. |
2497 |
2495 |
2498 543 |
2496 543 |
2499 00:27:22,925 --> 00:27:26,330 |
2497 00:27:22,925 --> 00:27:26,330 |
2500 So we just now n copies of that. |
2498 So we just build now n-copies of that. |
2501 |
2499 |
2502 544 |
2500 544 |
2503 00:27:26,330 --> 00:27:30,710 |
2501 00:27:26,330 --> 00:27:30,710 |
2504 Okay? Okay, so we can look |
2502 Okay, so we can look |
2505 at our first example. |
2503 at our first example. |
2506 |
2504 |
2507 545 |
2505 545 |
2508 00:27:30,710 --> 00:27:32,629 |
2506 00:27:30,710 --> 00:27:32,629 |
2509 This is the work expression, |
2507 This is the regular expression, |
2510 |
2508 |
2511 546 |
2509 546 |
2512 00:27:32,629 --> 00:27:36,035 |
2510 00:27:32,629 --> 00:27:36,035 |
2513 and I call that here |
2511 and I call that here |
2514 even rec expression1. |
2512 evil regular expression1. |
2515 |
2513 |
2516 547 |
2514 547 |
2517 00:27:36,035 --> 00:27:37,670 |
2515 00:27:36,035 --> 00:27:37,670 |
2518 It doesn't look as beautiful |
2516 It doesn't look as beautiful |
2519 |
2517 |
2531 if this can be improved. |
2529 if this can be improved. |
2532 But here it is. |
2530 But here it is. |
2533 |
2531 |
2534 551 |
2532 551 |
2535 00:27:43,640 --> 00:27:45,724 |
2533 00:27:43,640 --> 00:27:45,724 |
2536 Here's the reg expression, |
2534 Here's the regular expression, |
2537 |
2535 |
2538 552 |
2536 552 |
2539 00:27:45,724 --> 00:27:49,520 |
2537 00:27:45,724 --> 00:27:49,520 |
2540 and here's a little function |
2538 and here's a little function |
2541 which measures the time. |
2539 which measures the time. |
2544 00:27:49,520 --> 00:27:53,180 |
2542 00:27:49,520 --> 00:27:53,180 |
2545 And we can now start testing |
2543 And we can now start testing |
2546 |
2544 |
2547 554 |
2545 554 |
2548 00:27:53,180 --> 00:27:57,845 |
2546 00:27:53,180 --> 00:27:57,845 |
2549 this reg expression with |
2547 this regular expression with |
2550 strings of just containing A's. |
2548 strings just containing a's. |
2551 |
2549 |
2552 555 |
2550 555 |
2553 00:27:57,845 --> 00:28:00,020 |
2551 00:27:57,845 --> 00:28:00,020 |
2554 And we first a bit cautious, |
2552 And we are first a bit cautious, |
2555 |
2553 |
2556 556 |
2554 556 |
2557 00:28:00,020 --> 00:28:04,985 |
2555 00:28:00,020 --> 00:28:04,985 |
2558 be tested between 020 |
2556 we test it between 0 and 20, |
2559 and be count by two. |
2557 and we count by two. |
2560 |
2558 |
2561 557 |
2559 557 |
2562 00:28:04,985 --> 00:28:08,330 |
2560 00:28:04,985 --> 00:28:08,330 |
2563 And I actually will not |
2561 And I actually will not |
2564 start that with Scala, |
2562 start that within Scala, |
2565 |
2563 |
2566 558 |
2564 558 |
2567 00:28:08,330 --> 00:28:12,965 |
2565 00:28:08,330 --> 00:28:12,965 |
2568 but actually the orbital online. |
2566 but actually use Ammonite. |
2569 |
2567 |
2570 559 |
2568 559 |
2571 00:28:12,965 --> 00:28:15,305 |
2569 00:28:12,965 --> 00:28:15,305 |
2572 And that's out. |
2570 And that's out. |
2573 |
2571 |
2574 560 |
2572 560 |
2575 00:28:15,305 --> 00:28:17,269 |
2573 00:28:15,305 --> 00:28:17,269 |
2576 And that correlates. |
2574 And that calculates |
2577 |
2575 |
2578 561 |
2576 561 |
2579 00:28:17,269 --> 00:28:20,675 |
2577 00:28:17,269 --> 00:28:20,675 |
2580 So for 18 days it's pretty fast. |
2578 for 18 a's. It's pretty fast. |
2581 |
2579 |
2582 562 |
2580 562 |
2583 00:28:20,675 --> 00:28:24,815 |
2581 00:28:20,675 --> 00:28:24,815 |
2584 But for 20 it already |
2582 But for 20 it already |
2585 needs to seconds. |
2583 needs 2 seconds. |
2586 |
2584 |
2587 563 |
2585 563 |
2588 00:28:24,815 --> 00:28:28,265 |
2586 00:28:24,815 --> 00:28:28,265 |
2589 And you can see |
2587 And you can see |
2590 actually this jump from |
2588 actually this jump from |
2591 |
2589 |
2592 564 |
2590 564 |
2593 00:28:28,265 --> 00:28:32,825 |
2591 00:28:28,265 --> 00:28:32,825 |
2594 18 to 20 in the runtime |
2592 18 to 20 in the runtime |
2595 is prepared to. |
2593 is pretty bad too. |
2596 |
2594 |
2597 565 |
2595 565 |
2598 00:28:32,825 --> 00:28:37,460 |
2596 00:28:32,825 --> 00:28:37,460 |
2599 And if we actually measure |
2597 And if we actually measure |
2600 it more accurately, |
2598 it more accurately, |
2607 00:28:39,770 --> 00:28:41,255 |
2605 00:28:39,770 --> 00:28:41,255 |
2608 And what turns out, |
2606 And what turns out, |
2609 |
2607 |
2610 568 |
2608 568 |
2611 00:28:41,255 --> 00:28:45,830 |
2609 00:28:41,255 --> 00:28:45,830 |
2612 we actually inverse as Python |
2610 we are actually worse than Python |
2613 and Ruby in this example. |
2611 and Ruby in this example. |
2614 |
2612 |
2615 569 |
2613 569 |
2616 00:28:45,830 --> 00:28:49,230 |
2614 00:28:45,830 --> 00:28:49,230 |
2617 So I guess that's a fail. |
2615 So I guess that's a fail. |