videos/02-proof.srt
changeset 776 939c10745a3a
equal deleted inserted replaced
775:16aafc7691d8 776:939c10745a3a
       
     1 1
       
     2 00:00:06,260 --> 00:00:10,275
       
     3 I come back, I know
       
     4 the topic of proofs
       
     5 
       
     6 2
       
     7 00:00:10,275 --> 00:00:13,950
       
     8 is usually not the most
       
     9 favorite topics with students.
       
    10 
       
    11 3
       
    12 00:00:13,950 --> 00:00:17,220
       
    13 But I'm really passionate
       
    14 about proofs in
       
    15 
       
    16 4
       
    17 00:00:17,220 --> 00:00:19,320
       
    18 this module because they really
       
    19 
       
    20 5
       
    21 00:00:19,320 --> 00:00:21,885
       
    22 help you with understanding
       
    23 what's going on.
       
    24 
       
    25 6
       
    26 00:00:21,885 --> 00:00:25,170
       
    27 And very often they help
       
    28 you with preventing errors.
       
    29 
       
    30 7
       
    31 00:00:25,170 --> 00:00:28,695
       
    32 You seen me earlier doing
       
    33 these calculations.
       
    34 
       
    35 8
       
    36 00:00:28,695 --> 00:00:30,210
       
    37 And if I hadn't had
       
    38 
       
    39 9
       
    40 00:00:30,210 --> 00:00:33,330
       
    41 the safety net of
       
    42 knowing what I'm doing,
       
    43 
       
    44 10
       
    45 00:00:33,330 --> 00:00:35,130
       
    46 I could have easily gone wrong.
       
    47 
       
    48 11
       
    49 00:00:35,130 --> 00:00:36,390
       
    50 And that's actually a problem is
       
    51 
       
    52 12
       
    53 00:00:36,390 --> 00:00:38,935
       
    54 existing wreck Expression
       
    55 Matching engines.
       
    56 
       
    57 13
       
    58 00:00:38,935 --> 00:00:41,330
       
    59 Sometimes in corner
       
    60 cases as we seen,
       
    61 
       
    62 14
       
    63 00:00:41,330 --> 00:00:43,130
       
    64 they produce slow results,
       
    65 
       
    66 15
       
    67 00:00:43,130 --> 00:00:45,455
       
    68 which is actually not so bad.
       
    69 
       
    70 16
       
    71 00:00:45,455 --> 00:00:48,020
       
    72 Much worse is that
       
    73 we also know they
       
    74 
       
    75 17
       
    76 00:00:48,020 --> 00:00:50,570
       
    77 sometimes produce
       
    78 incorrect results.
       
    79 
       
    80 18
       
    81 00:00:50,570 --> 00:00:51,980
       
    82 So for us here,
       
    83 
       
    84 19
       
    85 00:00:51,980 --> 00:00:53,390
       
    86 the proofs will be
       
    87 
       
    88 20
       
    89 00:00:53,390 --> 00:00:55,100
       
    90 essentially the safety nets
       
    91 
       
    92 21
       
    93 00:00:55,100 --> 00:00:58,295
       
    94 of making sure that our
       
    95 algorithms actually correct.
       
    96 
       
    97 22
       
    98 00:00:58,295 --> 00:01:00,095
       
    99 And that's actually
       
   100 the beauty of that.
       
   101 
       
   102 23
       
   103 00:01:00,095 --> 00:01:03,170
       
   104 It doesn't happen very often
       
   105 in real life that we can
       
   106 
       
   107 24
       
   108 00:01:03,170 --> 00:01:06,500
       
   109 actually prove the
       
   110 correctness of our algorithm.
       
   111 
       
   112 25
       
   113 00:01:06,500 --> 00:01:08,405
       
   114 But we can do this here.
       
   115 
       
   116 26
       
   117 00:01:08,405 --> 00:01:11,000
       
   118 So we will start
       
   119 a bit slow first,
       
   120 
       
   121 27
       
   122 00:01:11,000 --> 00:01:14,330
       
   123 I will show you how
       
   124 properties are stated.
       
   125 
       
   126 28
       
   127 00:01:14,330 --> 00:01:17,255
       
   128 And then along the way
       
   129 we will see how proofs
       
   130 
       
   131 29
       
   132 00:01:17,255 --> 00:01:20,940
       
   133 actually can be performed
       
   134 and how they can help you.
       
   135 
       
   136 30
       
   137 00:01:20,980 --> 00:01:25,205
       
   138 You can probably remember this
       
   139 slide from the beginning.
       
   140 
       
   141 31
       
   142 00:01:25,205 --> 00:01:29,300
       
   143 Now we would probably stayed
       
   144 the same property that
       
   145 
       
   146 32
       
   147 00:01:29,300 --> 00:01:32,420
       
   148 our algorithm matches
       
   149 the specification
       
   150 
       
   151 33
       
   152 00:01:32,420 --> 00:01:33,695
       
   153 of what that means.
       
   154 
       
   155 34
       
   156 00:01:33,695 --> 00:01:35,450
       
   157 Maybe slightly more mathematical,
       
   158 
       
   159 35
       
   160 00:01:35,450 --> 00:01:37,130
       
   161 we would write
       
   162 something like this.
       
   163 
       
   164 36
       
   165 00:01:37,130 --> 00:01:39,545
       
   166 If our Marcia is given as
       
   167 
       
   168 37
       
   169 00:01:39,545 --> 00:01:42,740
       
   170 a string direct expression
       
   171 R and says yes,
       
   172 
       
   173 38
       
   174 00:01:42,740 --> 00:01:45,110
       
   175 then this string
       
   176 really needs to be in
       
   177 
       
   178 39
       
   179 00:01:45,110 --> 00:01:48,350
       
   180 the language of R. And
       
   181 if the matcher says no,
       
   182 
       
   183 40
       
   184 00:01:48,350 --> 00:01:50,120
       
   185 then this string cannot be in
       
   186 
       
   187 41
       
   188 00:01:50,120 --> 00:01:52,325
       
   189 the language of r. And we spent
       
   190 
       
   191 42
       
   192 00:01:52,325 --> 00:01:54,590
       
   193 quite some effort to make precise
       
   194 
       
   195 43
       
   196 00:01:54,590 --> 00:01:57,065
       
   197 what the language is
       
   198 of a wreck expression.
       
   199 
       
   200 44
       
   201 00:01:57,065 --> 00:01:59,090
       
   202 And we spend some effort to
       
   203 
       
   204 45
       
   205 00:01:59,090 --> 00:02:01,505
       
   206 make precise what
       
   207 our algorithm is.
       
   208 
       
   209 46
       
   210 00:02:01,505 --> 00:02:02,840
       
   211 So we could now take
       
   212 
       
   213 47
       
   214 00:02:02,840 --> 00:02:06,875
       
   215 this property and we could
       
   216 generate some test cases,
       
   217 
       
   218 48
       
   219 00:02:06,875 --> 00:02:09,410
       
   220 and that is all fine and good.
       
   221 
       
   222 49
       
   223 00:02:09,410 --> 00:02:13,340
       
   224 Testing is a very important
       
   225 thing for our software.
       
   226 
       
   227 50
       
   228 00:02:13,340 --> 00:02:16,715
       
   229 But we can only do
       
   230 this testing so far.
       
   231 
       
   232 51
       
   233 00:02:16,715 --> 00:02:19,385
       
   234 Remember in the homework,
       
   235 
       
   236 52
       
   237 00:02:19,385 --> 00:02:21,275
       
   238 there was a single string,
       
   239 
       
   240 53
       
   241 00:02:21,275 --> 00:02:23,750
       
   242 abcd, and there were
       
   243 actually infinitely
       
   244 
       
   245 54
       
   246 00:02:23,750 --> 00:02:27,530
       
   247 many wreck expressions which
       
   248 can match only that string.
       
   249 
       
   250 55
       
   251 00:02:27,530 --> 00:02:30,470
       
   252 So testing something for
       
   253 infinitely many things,
       
   254 
       
   255 56
       
   256 00:02:30,470 --> 00:02:32,000
       
   257 that's a bit hard.
       
   258 
       
   259 57
       
   260 00:02:32,000 --> 00:02:36,710
       
   261 And Summa, as soon as we have
       
   262 an expression with a star,
       
   263 
       
   264 58
       
   265 00:02:36,710 --> 00:02:39,289
       
   266 this l function returns,
       
   267 
       
   268 59
       
   269 00:02:39,289 --> 00:02:41,360
       
   270 in the general case
       
   271 an infinite set.
       
   272 
       
   273 60
       
   274 00:02:41,360 --> 00:02:43,940
       
   275 And testing whether a
       
   276 string is an infinite set,
       
   277 
       
   278 61
       
   279 00:02:43,940 --> 00:02:45,500
       
   280 that's also a bit hard.
       
   281 
       
   282 62
       
   283 00:02:45,500 --> 00:02:47,510
       
   284 So testing can only do so much.
       
   285 
       
   286 63
       
   287 00:02:47,510 --> 00:02:49,820
       
   288 We could generate some
       
   289 wreck expressions
       
   290 
       
   291 64
       
   292 00:02:49,820 --> 00:02:51,560
       
   293 and we can generate some strings.
       
   294 
       
   295 65
       
   296 00:02:51,560 --> 00:02:53,990
       
   297 And we can test
       
   298 whether that is true.
       
   299 
       
   300 66
       
   301 00:02:53,990 --> 00:02:57,470
       
   302 So B would be still
       
   303 in the quandary that
       
   304 
       
   305 67
       
   306 00:02:57,470 --> 00:03:00,950
       
   307 maybe there's a corner
       
   308 case which doesn't work.
       
   309 
       
   310 68
       
   311 00:03:00,950 --> 00:03:03,110
       
   312 So what we want to
       
   313 achieve instead
       
   314 
       
   315 69
       
   316 00:03:03,110 --> 00:03:05,615
       
   317 is you want to really make sure
       
   318 
       
   319 70
       
   320 00:03:05,615 --> 00:03:07,325
       
   321 that this algorithm works for
       
   322 
       
   323 71
       
   324 00:03:07,325 --> 00:03:10,850
       
   325 all reg expression
       
   326 and for all strings.
       
   327 
       
   328 72
       
   329 00:03:10,850 --> 00:03:16,400
       
   330 And that ONE proving will
       
   331 help us to establish.
       
   332 
       
   333 73
       
   334 00:03:16,400 --> 00:03:19,040
       
   335 Testing cannot do that.
       
   336 
       
   337 74
       
   338 00:03:19,040 --> 00:03:22,310
       
   339 Remember, metro
       
   340 mainly consisted of
       
   341 
       
   342 75
       
   343 00:03:22,310 --> 00:03:25,444
       
   344 these functions,
       
   345 nullable and derivative.
       
   346 
       
   347 76
       
   348 00:03:25,444 --> 00:03:27,755
       
   349 So if you want to prove
       
   350 anything about Metro.
       
   351 
       
   352 77
       
   353 00:03:27,755 --> 00:03:30,200
       
   354 Bet on, make sure that
       
   355 we know what actually
       
   356 
       
   357 78
       
   358 00:03:30,200 --> 00:03:33,890
       
   359 the specification of
       
   360 nullable and derivative is.
       
   361 
       
   362 79
       
   363 00:03:33,890 --> 00:03:35,750
       
   364 That's actually quite
       
   365 instructive to also
       
   366 
       
   367 80
       
   368 00:03:35,750 --> 00:03:37,850
       
   369 understand how they work.
       
   370 
       
   371 81
       
   372 00:03:37,850 --> 00:03:39,695
       
   373 So let's do that next.
       
   374 
       
   375 82
       
   376 00:03:39,695 --> 00:03:41,975
       
   377 The central property of nullable,
       
   378 
       
   379 83
       
   380 00:03:41,975 --> 00:03:43,445
       
   381 Actually that's not so hard.
       
   382 
       
   383 84
       
   384 00:03:43,445 --> 00:03:48,740
       
   385 It's just says a function or
       
   386 a reg expression is nullable
       
   387 
       
   388 85
       
   389 00:03:48,740 --> 00:03:50,360
       
   390 if and only if
       
   391 
       
   392 86
       
   393 00:03:50,360 --> 00:03:52,160
       
   394 the empty string is in
       
   395 
       
   396 87
       
   397 00:03:52,160 --> 00:03:54,410
       
   398 the language that was the
       
   399 purpose of this novel,
       
   400 
       
   401 88
       
   402 00:03:54,410 --> 00:03:55,670
       
   403 it takes work, expression and
       
   404 
       
   405 89
       
   406 00:03:55,670 --> 00:03:58,490
       
   407 test whether can match
       
   408 the empty string.
       
   409 
       
   410 90
       
   411 00:03:58,490 --> 00:04:01,010
       
   412 Meaning this empty string needs
       
   413 
       
   414 91
       
   415 00:04:01,010 --> 00:04:03,230
       
   416 to be in the language
       
   417 of R. And again,
       
   418 
       
   419 92
       
   420 00:04:03,230 --> 00:04:04,385
       
   421 we have this if an only,
       
   422 
       
   423 93
       
   424 00:04:04,385 --> 00:04:05,885
       
   425 if this one says yes,
       
   426 
       
   427 94
       
   428 00:04:05,885 --> 00:04:08,465
       
   429 then the empty string needs
       
   430 to be in this language.
       
   431 
       
   432 95
       
   433 00:04:08,465 --> 00:04:10,385
       
   434 And if this nullable says no,
       
   435 
       
   436 96
       
   437 00:04:10,385 --> 00:04:13,085
       
   438 the empty string CoMP
       
   439 unless language.
       
   440 
       
   441 97
       
   442 00:04:13,085 --> 00:04:15,290
       
   443 So that's relatively easy.
       
   444 
       
   445 98
       
   446 00:04:15,290 --> 00:04:18,110
       
   447 The next one might be also
       
   448 very instructive to look
       
   449 
       
   450 99
       
   451 00:04:18,110 --> 00:04:22,775
       
   452 at what's the property the
       
   453 derivative should satisfy.
       
   454 
       
   455 100
       
   456 00:04:22,775 --> 00:04:26,329
       
   457 And remember, the derivative
       
   458 was a kind of operation.
       
   459 
       
   460 101
       
   461 00:04:26,329 --> 00:04:28,850
       
   462 We taking a wrecker
       
   463 expression and we're looking
       
   464 
       
   465 102
       
   466 00:04:28,850 --> 00:04:31,955
       
   467 at all the strings this
       
   468 reg expression can match,
       
   469 
       
   470 103
       
   471 00:04:31,955 --> 00:04:35,510
       
   472 but which starts with
       
   473 C. And then me somehow
       
   474 
       
   475 104
       
   476 00:04:35,510 --> 00:04:37,250
       
   477 looking for reg expression
       
   478 which can match
       
   479 
       
   480 105
       
   481 00:04:37,250 --> 00:04:39,335
       
   482 the same string starting with C,
       
   483 
       
   484 106
       
   485 00:04:39,335 --> 00:04:42,440
       
   486 except that this
       
   487 c is chopped off.
       
   488 
       
   489 107
       
   490 00:04:42,440 --> 00:04:47,150
       
   491 And that actually saves
       
   492 what this property states.
       
   493 
       
   494 108
       
   495 00:04:47,150 --> 00:04:50,300
       
   496 Take the, take a
       
   497 break expression are,
       
   498 
       
   499 109
       
   500 00:04:50,300 --> 00:04:53,045
       
   501 and build the derivative
       
   502 according to see,
       
   503 
       
   504 110
       
   505 00:04:53,045 --> 00:04:56,150
       
   506 okay, so this gives
       
   507 us a reg expression,
       
   508 
       
   509 111
       
   510 00:04:56,150 --> 00:04:58,025
       
   511 the derivative of that
       
   512 
       
   513 112
       
   514 00:04:58,025 --> 00:05:01,775
       
   515 r. Now we take the
       
   516 a language of that,
       
   517 
       
   518 113
       
   519 00:05:01,775 --> 00:05:06,620
       
   520 are that supposed to be
       
   521 the same set as if I take
       
   522 
       
   523 114
       
   524 00:05:06,620 --> 00:05:12,845
       
   525 the language of R and then
       
   526 build the semantic derivative.
       
   527 
       
   528 115
       
   529 00:05:12,845 --> 00:05:16,055
       
   530 Remember the semantic derivative
       
   531 was doing exactly that.
       
   532 
       
   533 116
       
   534 00:05:16,055 --> 00:05:18,380
       
   535 It goes through all
       
   536 the strings in this
       
   537 
       
   538 117
       
   539 00:05:18,380 --> 00:05:21,350
       
   540 that filters out everything which
       
   541 
       
   542 118
       
   543 00:05:21,350 --> 00:05:24,770
       
   544 doesn't start with C.
       
   545 So throws them away of
       
   546 
       
   547 119
       
   548 00:05:24,770 --> 00:05:26,510
       
   549 the remaining strings
       
   550 which start with
       
   551 
       
   552 120
       
   553 00:05:26,510 --> 00:05:28,670
       
   554 C. It chops off so that C,
       
   555 
       
   556 121
       
   557 00:05:28,670 --> 00:05:31,370
       
   558 So this will be exactly
       
   559 
       
   560 122
       
   561 00:05:31,370 --> 00:05:34,205
       
   562 what our derivative
       
   563 is supposed to do.
       
   564 
       
   565 123
       
   566 00:05:34,205 --> 00:05:36,605
       
   567 And yes, if you can
       
   568 establish that,
       
   569 
       
   570 124
       
   571 00:05:36,605 --> 00:05:38,450
       
   572 then the language of
       
   573 
       
   574 125
       
   575 00:05:38,450 --> 00:05:43,355
       
   576 this derivative operation
       
   577 will produce the same sat.
       
   578 
       
   579 126
       
   580 00:05:43,355 --> 00:05:45,815
       
   581 This actually also
       
   582 how I'm going to
       
   583 
       
   584 127
       
   585 00:05:45,815 --> 00:05:49,760
       
   586 test in cases icons
       
   587 yet at the beginning,
       
   588 
       
   589 128
       
   590 00:05:49,760 --> 00:05:52,910
       
   591 how your coursework is correct.
       
   592 
       
   593 129
       
   594 00:05:52,910 --> 00:05:55,625
       
   595 If you give me a
       
   596 definition for derivative
       
   597 
       
   598 130
       
   599 00:05:55,625 --> 00:05:58,730
       
   600 of one of these bounded
       
   601 record expressions,
       
   602 
       
   603 131
       
   604 00:05:58,730 --> 00:06:01,085
       
   605 and I can't see
       
   606 whether it's correct.
       
   607 
       
   608 132
       
   609 00:06:01,085 --> 00:06:03,380
       
   610 Then what it will do
       
   611 is I will just try to
       
   612 
       
   613 133
       
   614 00:06:03,380 --> 00:06:05,675
       
   615 establish that this
       
   616 property holds.
       
   617 
       
   618 134
       
   619 00:06:05,675 --> 00:06:07,265
       
   620 And if I can establish that,
       
   621 
       
   622 135
       
   623 00:06:07,265 --> 00:06:08,870
       
   624 then you coursework is find.
       
   625 
       
   626 136
       
   627 00:06:08,870 --> 00:06:10,550
       
   628 And if I can't establish that,
       
   629 
       
   630 137
       
   631 00:06:10,550 --> 00:06:12,410
       
   632 then usually it shows
       
   633 
       
   634 138
       
   635 00:06:12,410 --> 00:06:14,360
       
   636 there is a counterexample
       
   637 somewhere.
       
   638 
       
   639 139
       
   640 00:06:14,360 --> 00:06:19,220
       
   641 Because this derivative,
       
   642 as you've seen desk times,
       
   643 
       
   644 140
       
   645 00:06:19,220 --> 00:06:21,725
       
   646 more than one solution
       
   647 you can give.
       
   648 
       
   649 141
       
   650 00:06:21,725 --> 00:06:23,915
       
   651 Some of them are more
       
   652 efficient than others.
       
   653 
       
   654 142
       
   655 00:06:23,915 --> 00:06:25,685
       
   656 But in terms of correctness,
       
   657 
       
   658 143
       
   659 00:06:25,685 --> 00:06:30,680
       
   660 there are multiple solutions
       
   661 for the coursework.
       
   662 
       
   663 144
       
   664 00:06:30,680 --> 00:06:34,145
       
   665 And in the past I really had
       
   666 to scratch my head and say,
       
   667 
       
   668 145
       
   669 00:06:34,145 --> 00:06:36,995
       
   670 Is that definition
       
   671 really correct?
       
   672 
       
   673 146
       
   674 00:06:36,995 --> 00:06:39,620
       
   675 And so what I did is
       
   676 I just went ahead and
       
   677 
       
   678 147
       
   679 00:06:39,620 --> 00:06:42,785
       
   680 tried to prove that or
       
   681 establish this property.
       
   682 
       
   683 148
       
   684 00:06:42,785 --> 00:06:45,110
       
   685 The reason why we writing
       
   686 
       
   687 149
       
   688 00:06:45,110 --> 00:06:47,300
       
   689 these properties
       
   690 down so precisely,
       
   691 
       
   692 150
       
   693 00:06:47,300 --> 00:06:49,880
       
   694 remember we know here
       
   695 what the derivative is,
       
   696 
       
   697 151
       
   698 00:06:49,880 --> 00:06:51,290
       
   699 we know what the
       
   700 languages we know
       
   701 
       
   702 152
       
   703 00:06:51,290 --> 00:06:53,495
       
   704 what this semantic derivative is.
       
   705 
       
   706 153
       
   707 00:06:53,495 --> 00:06:56,960
       
   708 This is usually when you are
       
   709 trying to prove something.
       
   710 
       
   711 154
       
   712 00:06:56,960 --> 00:06:58,715
       
   713 And very important step.
       
   714 
       
   715 155
       
   716 00:06:58,715 --> 00:07:01,385
       
   717 If you know what you're
       
   718 actually trying to prove,
       
   719 
       
   720 156
       
   721 00:07:01,385 --> 00:07:05,510
       
   722 then you have already
       
   723 done half of the proof.
       
   724 
       
   725 157
       
   726 00:07:05,510 --> 00:07:08,300
       
   727 Remember, we want to
       
   728 establish these properties
       
   729 
       
   730 158
       
   731 00:07:08,300 --> 00:07:10,460
       
   732 not just for some reg expression,
       
   733 
       
   734 159
       
   735 00:07:10,460 --> 00:07:13,940
       
   736 but for all wrecker expression
       
   737 and for all characters.
       
   738 
       
   739 160
       
   740 00:07:13,940 --> 00:07:17,225
       
   741 So proving is the only
       
   742 method to achieve that.
       
   743 
       
   744 161
       
   745 00:07:17,225 --> 00:07:18,590
       
   746 And I said at the beginning of
       
   747 
       
   748 162
       
   749 00:07:18,590 --> 00:07:21,170
       
   750 this video's proving is
       
   751 
       
   752 163
       
   753 00:07:21,170 --> 00:07:23,810
       
   754 not the most favorite
       
   755 subject of students.
       
   756 
       
   757 164
       
   758 00:07:23,810 --> 00:07:25,730
       
   759 But I can promise you,
       
   760 
       
   761 165
       
   762 00:07:25,730 --> 00:07:29,630
       
   763 if you just follow a very
       
   764 simple minded recipe,
       
   765 
       
   766 166
       
   767 00:07:29,630 --> 00:07:33,695
       
   768 that brings, you're already
       
   769 very far in these proofs.
       
   770 
       
   771 167
       
   772 00:07:33,695 --> 00:07:35,720
       
   773 Of course, there's
       
   774 always some creativity
       
   775 
       
   776 168
       
   777 00:07:35,720 --> 00:07:37,070
       
   778 involved in these proofs.
       
   779 
       
   780 169
       
   781 00:07:37,070 --> 00:07:39,380
       
   782 But a simple-minded
       
   783 recipe you can
       
   784 
       
   785 170
       
   786 00:07:39,380 --> 00:07:42,620
       
   787 follow brings you're already
       
   788 very close to the end.
       
   789 
       
   790 171
       
   791 00:07:42,620 --> 00:07:46,610
       
   792 So let's have a look how
       
   793 this recipe can be derived.
       
   794 
       
   795 172
       
   796 00:07:46,610 --> 00:07:48,740
       
   797 Establishing a property is for
       
   798 
       
   799 173
       
   800 00:07:48,740 --> 00:07:50,900
       
   801 all Rekha expression nearly
       
   802 
       
   803 174
       
   804 00:07:50,900 --> 00:07:55,805
       
   805 always requires an induction
       
   806 over wreck expressions.
       
   807 
       
   808 175
       
   809 00:07:55,805 --> 00:07:57,350
       
   810 And so we have to look at what's
       
   811 
       
   812 176
       
   813 00:07:57,350 --> 00:07:59,150
       
   814 the recipe for doing inductions
       
   815 
       
   816 177
       
   817 00:07:59,150 --> 00:08:02,390
       
   818 over wreck expressions
       
   819 and inductions.
       
   820 
       
   821 178
       
   822 00:08:02,390 --> 00:08:03,590
       
   823 You probably remember from
       
   824 
       
   825 179
       
   826 00:08:03,590 --> 00:08:07,100
       
   827 school time you did induction
       
   828 over natural numbers.
       
   829 
       
   830 180
       
   831 00:08:07,100 --> 00:08:11,120
       
   832 You had to establish that
       
   833 some property holds for 0.
       
   834 
       
   835 181
       
   836 00:08:11,120 --> 00:08:13,775
       
   837 And you had to establish it
       
   838 
       
   839 182
       
   840 00:08:13,775 --> 00:08:14,960
       
   841 that if you assume that
       
   842 
       
   843 183
       
   844 00:08:14,960 --> 00:08:16,970
       
   845 the property holds
       
   846 for the N case,
       
   847 
       
   848 184
       
   849 00:08:16,970 --> 00:08:21,365
       
   850 it has to also hold for
       
   851 the n plus one case.
       
   852 
       
   853 185
       
   854 00:08:21,365 --> 00:08:24,350
       
   855 And induction over wreck
       
   856 expressions is very,
       
   857 
       
   858 186
       
   859 00:08:24,350 --> 00:08:26,810
       
   860 very similar to this
       
   861 kind of induction.
       
   862 
       
   863 187
       
   864 00:08:26,810 --> 00:08:30,754
       
   865 Just that in case of
       
   866 break expressions,
       
   867 
       
   868 188
       
   869 00:08:30,754 --> 00:08:33,200
       
   870 we have essentially
       
   871 three base cases.
       
   872 
       
   873 189
       
   874 00:08:33,200 --> 00:08:35,360
       
   875 So in our grandma
       
   876 or we say it rec,
       
   877 
       
   878 190
       
   879 00:08:35,360 --> 00:08:38,705
       
   880 expressions are 01
       
   881 and characters.
       
   882 
       
   883 191
       
   884 00:08:38,705 --> 00:08:42,260
       
   885 And these will be the base
       
   886 cases in our induction.
       
   887 
       
   888 192
       
   889 00:08:42,260 --> 00:08:43,895
       
   890 So we have to establish
       
   891 
       
   892 193
       
   893 00:08:43,895 --> 00:08:46,475
       
   894 any property we want
       
   895 to prove by induction,
       
   896 
       
   897 194
       
   898 00:08:46,475 --> 00:08:49,055
       
   899 you have to show
       
   900 that it holds for 0,
       
   901 
       
   902 195
       
   903 00:08:49,055 --> 00:08:50,495
       
   904 it holds for one,
       
   905 
       
   906 196
       
   907 00:08:50,495 --> 00:08:52,505
       
   908 and it holds for the character.
       
   909 
       
   910 197
       
   911 00:08:52,505 --> 00:08:57,350
       
   912 So there's nothing else we
       
   913 can assume in these cases.
       
   914 
       
   915 198
       
   916 00:08:57,350 --> 00:08:59,300
       
   917 We just have to prove
       
   918 that this property
       
   919 
       
   920 199
       
   921 00:08:59,300 --> 00:09:01,835
       
   922 holds in these three base cases.
       
   923 
       
   924 200
       
   925 00:09:01,835 --> 00:09:04,490
       
   926 It is very much the
       
   927 same as if you proved
       
   928 
       
   929 201
       
   930 00:09:04,490 --> 00:09:06,050
       
   931 something over natural numbers
       
   932 
       
   933 202
       
   934 00:09:06,050 --> 00:09:08,270
       
   935 and you had to prove it for 0.
       
   936 
       
   937 203
       
   938 00:09:08,270 --> 00:09:10,880
       
   939 Now, the sequence case,
       
   940 
       
   941 204
       
   942 00:09:10,880 --> 00:09:12,755
       
   943 the alternative in a star.
       
   944 
       
   945 205
       
   946 00:09:12,755 --> 00:09:15,815
       
   947 These are the kind
       
   948 of inductive cases.
       
   949 
       
   950 206
       
   951 00:09:15,815 --> 00:09:17,750
       
   952 Remember in natural numbers,
       
   953 
       
   954 207
       
   955 00:09:17,750 --> 00:09:20,450
       
   956 you could assume that
       
   957 a property holds for
       
   958 
       
   959 208
       
   960 00:09:20,450 --> 00:09:24,770
       
   961 n and then you had to
       
   962 establish it for n plus one.
       
   963 
       
   964 209
       
   965 00:09:24,770 --> 00:09:26,945
       
   966 This is here very simmer.
       
   967 
       
   968 210
       
   969 00:09:26,945 --> 00:09:29,000
       
   970 You can assume that
       
   971 
       
   972 211
       
   973 00:09:29,000 --> 00:09:30,920
       
   974 these properties hold for
       
   975 
       
   976 212
       
   977 00:09:30,920 --> 00:09:33,140
       
   978 the components of
       
   979 these reg expression.
       
   980 
       
   981 213
       
   982 00:09:33,140 --> 00:09:35,840
       
   983 So you can assume that it
       
   984 holds for this on one,
       
   985 
       
   986 214
       
   987 00:09:35,840 --> 00:09:38,960
       
   988 and you can assume that
       
   989 it holds for this R2.
       
   990 
       
   991 215
       
   992 00:09:38,960 --> 00:09:41,090
       
   993 And your task is now to
       
   994 
       
   995 216
       
   996 00:09:41,090 --> 00:09:44,240
       
   997 establish this property
       
   998 also for sequences.
       
   999 
       
  1000 217
       
  1001 00:09:44,240 --> 00:09:46,255
       
  1002 The same for alternative soup.
       
  1003 
       
  1004 218
       
  1005 00:09:46,255 --> 00:09:48,710
       
  1006 Assume that it holds
       
  1007 already for R1.
       
  1008 
       
  1009 219
       
  1010 00:09:48,710 --> 00:09:51,005
       
  1011 You can assume that
       
  1012 it holds for the R2.
       
  1013 
       
  1014 220
       
  1015 00:09:51,005 --> 00:09:53,930
       
  1016 You have to establish that
       
  1017 this property holds also for
       
  1018 
       
  1019 221
       
  1020 00:09:53,930 --> 00:09:58,460
       
  1021 the entire alternative,
       
  1022 the same four star.
       
  1023 
       
  1024 222
       
  1025 00:09:58,460 --> 00:10:00,440
       
  1026 So there you can assume it for r,
       
  1027 
       
  1028 223
       
  1029 00:10:00,440 --> 00:10:02,330
       
  1030 but you have to now
       
  1031 establish it that
       
  1032 
       
  1033 224
       
  1034 00:10:02,330 --> 00:10:04,760
       
  1035 it also holds for r star.
       
  1036 
       
  1037 225
       
  1038 00:10:04,760 --> 00:10:07,910
       
  1039 Ok, that leads to this
       
  1040 simple-minded recipe
       
  1041 
       
  1042 226
       
  1043 00:10:07,910 --> 00:10:09,560
       
  1044 which you can always follow.
       
  1045 
       
  1046 227
       
  1047 00:10:09,560 --> 00:10:10,970
       
  1048 So first you have to know
       
  1049 
       
  1050 228
       
  1051 00:10:10,970 --> 00:10:13,085
       
  1052 what's the property
       
  1053 you want to show.
       
  1054 
       
  1055 229
       
  1056 00:10:13,085 --> 00:10:16,790
       
  1057 Luckily for you, that's
       
  1058 almost always given by me.
       
  1059 
       
  1060 230
       
  1061 00:10:16,790 --> 00:10:19,100
       
  1062 So there's nothing
       
  1063 to think about it.
       
  1064 
       
  1065 231
       
  1066 00:10:19,100 --> 00:10:22,100
       
  1067 But then what you have to
       
  1068 do in an induction proof,
       
  1069 
       
  1070 232
       
  1071 00:10:22,100 --> 00:10:23,855
       
  1072 you have to show that
       
  1073 this property holds
       
  1074 
       
  1075 233
       
  1076 00:10:23,855 --> 00:10:26,345
       
  1077 for 041 and characters.
       
  1078 
       
  1079 234
       
  1080 00:10:26,345 --> 00:10:28,790
       
  1081 These are the base
       
  1082 cases in the induction.
       
  1083 
       
  1084 235
       
  1085 00:10:28,790 --> 00:10:31,280
       
  1086 And then in the induction cases,
       
  1087 
       
  1088 236
       
  1089 00:10:31,280 --> 00:10:32,555
       
  1090 you just have to say, well,
       
  1091 
       
  1092 237
       
  1093 00:10:32,555 --> 00:10:34,535
       
  1094 I know that this
       
  1095 property holds for one,
       
  1096 
       
  1097 238
       
  1098 00:10:34,535 --> 00:10:36,980
       
  1099 I know that the
       
  1100 property holds for R2.
       
  1101 
       
  1102 239
       
  1103 00:10:36,980 --> 00:10:38,720
       
  1104 What does it look like that
       
  1105 
       
  1106 240
       
  1107 00:10:38,720 --> 00:10:40,895
       
  1108 this property holds
       
  1109 for the alternative,
       
  1110 
       
  1111 241
       
  1112 00:10:40,895 --> 00:10:44,480
       
  1113 for the sequence
       
  1114 and for the r-star,
       
  1115 
       
  1116 242
       
  1117 00:10:44,480 --> 00:10:46,955
       
  1118 that will then requires
       
  1119 some reasoning.
       
  1120 
       
  1121 243
       
  1122 00:10:46,955 --> 00:10:48,230
       
  1123 But if you know what you can
       
  1124 
       
  1125 244
       
  1126 00:10:48,230 --> 00:10:50,359
       
  1127 assume and what do you
       
  1128 have to establish?
       
  1129 
       
  1130 245
       
  1131 00:10:50,359 --> 00:10:52,940
       
  1132 Often can already see bought
       
  1133 
       
  1134 246
       
  1135 00:10:52,940 --> 00:10:55,490
       
  1136 needs to be proved and
       
  1137 what's the argument like?
       
  1138 
       
  1139 247
       
  1140 00:10:55,490 --> 00:10:58,325
       
  1141 So let's try that and
       
  1142 then concrete example.
       
  1143 
       
  1144 248
       
  1145 00:10:58,325 --> 00:11:01,265
       
  1146 So how does approve
       
  1147 work in real life?
       
  1148 
       
  1149 249
       
  1150 00:11:01,265 --> 00:11:03,530
       
  1151 And say that the beginning
       
  1152 first you have to know
       
  1153 
       
  1154 250
       
  1155 00:11:03,530 --> 00:11:05,630
       
  1156 what's the property
       
  1157 you want to prove.
       
  1158 
       
  1159 251
       
  1160 00:11:05,630 --> 00:11:10,280
       
  1161 So in our case, the P of R is if
       
  1162 
       
  1163 252
       
  1164 00:11:10,280 --> 00:11:14,900
       
  1165 nullable of R holds if
       
  1166 
       
  1167 253
       
  1168 00:11:14,900 --> 00:11:19,760
       
  1169 and only if the empty string
       
  1170 is in the language of.
       
  1171 
       
  1172 254
       
  1173 00:11:19,760 --> 00:11:23,690
       
  1174 So we want to do an induction
       
  1175 of rec expressions.
       
  1176 
       
  1177 255
       
  1178 00:11:23,690 --> 00:11:28,400
       
  1179 So we better have Fan property
       
  1180 which depends on this
       
  1181 
       
  1182 256
       
  1183 00:11:28,400 --> 00:11:31,130
       
  1184 R. And so now in
       
  1185 
       
  1186 257
       
  1187 00:11:31,130 --> 00:11:34,985
       
  1188 each case I would start with
       
  1189 a blank piece of paper.
       
  1190 
       
  1191 258
       
  1192 00:11:34,985 --> 00:11:38,030
       
  1193 And I would write on the top,
       
  1194 
       
  1195 259
       
  1196 00:11:38,030 --> 00:11:40,190
       
  1197 what are my assumptions?
       
  1198 
       
  1199 260
       
  1200 00:11:40,190 --> 00:11:42,470
       
  1201 And on the bottom, I would write
       
  1202 
       
  1203 261
       
  1204 00:11:42,470 --> 00:11:45,485
       
  1205 down what I need to prove.
       
  1206 
       
  1207 262
       
  1208 00:11:45,485 --> 00:11:49,280
       
  1209 And then I would try
       
  1210 to somehow navigate
       
  1211 
       
  1212 263
       
  1213 00:11:49,280 --> 00:11:53,540
       
  1214 my way from these assumptions
       
  1215 to what I want to prove,
       
  1216 
       
  1217 264
       
  1218 00:11:53,540 --> 00:11:57,740
       
  1219 to somehow infer
       
  1220 small steps so that I
       
  1221 
       
  1222 265
       
  1223 00:11:57,740 --> 00:11:59,960
       
  1224 can assure that what I'm trying
       
  1225 
       
  1226 266
       
  1227 00:11:59,960 --> 00:12:02,780
       
  1228 to prove holds for
       
  1229 my assumptions.
       
  1230 
       
  1231 267
       
  1232 00:12:02,780 --> 00:12:06,905
       
  1233 Obviously. That depends
       
  1234 on how hard that case is.
       
  1235 
       
  1236 268
       
  1237 00:12:06,905 --> 00:12:08,930
       
  1238 And if you look at
       
  1239 our first case,
       
  1240 
       
  1241 269
       
  1242 00:12:08,930 --> 00:12:10,835
       
  1243 what we have to show is if to
       
  1244 
       
  1245 270
       
  1246 00:12:10,835 --> 00:12:13,655
       
  1247 prove that this P of 0 holds.
       
  1248 
       
  1249 271
       
  1250 00:12:13,655 --> 00:12:15,410
       
  1251 And actually no assumption in
       
  1252 
       
  1253 272
       
  1254 00:12:15,410 --> 00:12:17,450
       
  1255 this case because it's
       
  1256 one of the base cases.
       
  1257 
       
  1258 273
       
  1259 00:12:17,450 --> 00:12:23,015
       
  1260 So what I have to show is I
       
  1261 have to show that nullable of
       
  1262 
       
  1263 274
       
  1264 00:12:23,015 --> 00:12:29,570
       
  1265 0 if and only if the empty
       
  1266 string is in L of 0.
       
  1267 
       
  1268 275
       
  1269 00:12:29,570 --> 00:12:31,340
       
  1270 Okay? And I have to
       
  1271 
       
  1272 276
       
  1273 00:12:31,340 --> 00:12:34,190
       
  1274 somehow show that this
       
  1275 implementation of nullable in
       
  1276 
       
  1277 277
       
  1278 00:12:34,190 --> 00:12:35,600
       
  1279 this case produces
       
  1280 
       
  1281 278
       
  1282 00:12:35,600 --> 00:12:39,185
       
  1283 exactly the same answer
       
  1284 as that specification.
       
  1285 
       
  1286 279
       
  1287 00:12:39,185 --> 00:12:41,660
       
  1288 Okay? So what is nullable of 0 is
       
  1289 
       
  1290 280
       
  1291 00:12:41,660 --> 00:12:45,605
       
  1292 defined where debt is
       
  1293 just false by definition.
       
  1294 
       
  1295 281
       
  1296 00:12:45,605 --> 00:12:50,225
       
  1297 And what is this
       
  1298 language of LCA defined?
       
  1299 
       
  1300 282
       
  1301 00:12:50,225 --> 00:12:52,535
       
  1302 Well, that is just
       
  1303 the empty string.
       
  1304 
       
  1305 283
       
  1306 00:12:52,535 --> 00:12:55,010
       
  1307 In the empty language.
       
  1308 
       
  1309 284
       
  1310 00:12:55,010 --> 00:12:57,110
       
  1311 Is the empty string and
       
  1312 the empty language,
       
  1313 
       
  1314 285
       
  1315 00:12:57,110 --> 00:12:59,240
       
  1316 no, that is false too.
       
  1317 
       
  1318 286
       
  1319 00:12:59,240 --> 00:13:01,610
       
  1320 So I have shown that
       
  1321 
       
  1322 287
       
  1323 00:13:01,610 --> 00:13:03,890
       
  1324 both of them produce
       
  1325 the same result.
       
  1326 
       
  1327 288
       
  1328 00:13:03,890 --> 00:13:08,765
       
  1329 So I would already be done
       
  1330 in the first base case.
       
  1331 
       
  1332 289
       
  1333 00:13:08,765 --> 00:13:12,125
       
  1334 The second base case is not
       
  1335 so much more difficult.
       
  1336 
       
  1337 290
       
  1338 00:13:12,125 --> 00:13:15,455
       
  1339 We want to prove
       
  1340 that P of one holds.
       
  1341 
       
  1342 291
       
  1343 00:13:15,455 --> 00:13:21,680
       
  1344 And that would say
       
  1345 if nullable of one,
       
  1346 
       
  1347 292
       
  1348 00:13:21,680 --> 00:13:23,825
       
  1349 if and only if
       
  1350 
       
  1351 293
       
  1352 00:13:23,825 --> 00:13:27,665
       
  1353 the empty string is in
       
  1354 the language of one.
       
  1355 
       
  1356 294
       
  1357 00:13:27,665 --> 00:13:31,235
       
  1358 Okay? I can again look
       
  1359 at what's that defined,
       
  1360 
       
  1361 295
       
  1362 00:13:31,235 --> 00:13:33,275
       
  1363 that is defined as true.
       
  1364 
       
  1365 296
       
  1366 00:13:33,275 --> 00:13:35,450
       
  1367 And what is this language to
       
  1368 
       
  1369 297
       
  1370 00:13:35,450 --> 00:13:37,640
       
  1371 find where that is defined as
       
  1372 
       
  1373 298
       
  1374 00:13:37,640 --> 00:13:39,320
       
  1375 the empty string in
       
  1376 
       
  1377 299
       
  1378 00:13:39,320 --> 00:13:42,230
       
  1379 the language containing
       
  1380 the empty string.
       
  1381 
       
  1382 300
       
  1383 00:13:42,230 --> 00:13:44,150
       
  1384 And is the empty string in
       
  1385 
       
  1386 301
       
  1387 00:13:44,150 --> 00:13:46,055
       
  1388 the language containing
       
  1389 the empty string?
       
  1390 
       
  1391 302
       
  1392 00:13:46,055 --> 00:13:47,975
       
  1393 Yes, that's true as well.
       
  1394 
       
  1395 303
       
  1396 00:13:47,975 --> 00:13:50,735
       
  1397 So again, I've shown that
       
  1398 the actual implementation
       
  1399 
       
  1400 304
       
  1401 00:13:50,735 --> 00:13:55,250
       
  1402 produces the same result
       
  1403 as the specification.
       
  1404 
       
  1405 305
       
  1406 00:13:55,250 --> 00:13:57,635
       
  1407 I leave the character base case
       
  1408 
       
  1409 306
       
  1410 00:13:57,635 --> 00:14:00,690
       
  1411 for you to fill in the details.
       
  1412 
       
  1413 307
       
  1414 00:14:01,150 --> 00:14:03,320
       
  1415 More interesting are these
       
  1416 
       
  1417 308
       
  1418 00:14:03,320 --> 00:14:05,300
       
  1419 inductive cases as
       
  1420 you can imagine.
       
  1421 
       
  1422 309
       
  1423 00:14:05,300 --> 00:14:08,360
       
  1424 So let's try to prove
       
  1425 it for the alternative.
       
  1426 
       
  1427 310
       
  1428 00:14:08,360 --> 00:14:10,639
       
  1429 So I want to show a property
       
  1430 
       
  1431 311
       
  1432 00:14:10,639 --> 00:14:13,790
       
  1433 for this record expression way.
       
  1434 
       
  1435 312
       
  1436 00:14:13,790 --> 00:14:15,410
       
  1437 What does our property save
       
  1438 
       
  1439 313
       
  1440 00:14:15,410 --> 00:14:16,970
       
  1441 for that reg expression
       
  1442 where we have to
       
  1443 
       
  1444 314
       
  1445 00:14:16,970 --> 00:14:20,915
       
  1446 say that nullable of
       
  1447 
       
  1448 315
       
  1449 00:14:20,915 --> 00:14:25,490
       
  1450 R1 plus R2 if and only if
       
  1451 
       
  1452 316
       
  1453 00:14:25,490 --> 00:14:31,519
       
  1454 the empty string is in the
       
  1455 language if R1 plus R2.
       
  1456 
       
  1457 317
       
  1458 00:14:31,519 --> 00:14:35,090
       
  1459 Ok? So I would now go ahead with
       
  1460 
       
  1461 318
       
  1462 00:14:35,090 --> 00:14:39,695
       
  1463 my plank piece of paper and
       
  1464 I would write down on top.
       
  1465 
       
  1466 319
       
  1467 00:14:39,695 --> 00:14:41,475
       
  1468 Now what can I assume?
       
  1469 
       
  1470 320
       
  1471 00:14:41,475 --> 00:14:45,170
       
  1472 Well, I can assume
       
  1473 that P of R1 holds and
       
  1474 
       
  1475 321
       
  1476 00:14:45,170 --> 00:14:49,175
       
  1477 I can assume what
       
  1478 p of r two holds.
       
  1479 
       
  1480 322
       
  1481 00:14:49,175 --> 00:14:50,750
       
  1482 And let's just write that down.
       
  1483 
       
  1484 323
       
  1485 00:14:50,750 --> 00:14:56,210
       
  1486 We can say that nullable of R1 if
       
  1487 
       
  1488 324
       
  1489 00:14:56,210 --> 00:14:58,655
       
  1490 and only if the empty string
       
  1491 
       
  1492 325
       
  1493 00:14:58,655 --> 00:15:02,000
       
  1494 is in the language of
       
  1495 R1, I can assume that.
       
  1496 
       
  1497 326
       
  1498 00:15:02,000 --> 00:15:03,500
       
  1499 And I can assume
       
  1500 
       
  1501 327
       
  1502 00:15:03,500 --> 00:15:10,355
       
  1503 that none above our two if
       
  1504 and only if the empty string.
       
  1505 
       
  1506 328
       
  1507 00:15:10,355 --> 00:15:14,675
       
  1508 Or to Hawaii. What's now,
       
  1509 
       
  1510 329
       
  1511 00:15:14,675 --> 00:15:15,830
       
  1512 what do I want to prove?
       
  1513 
       
  1514 330
       
  1515 00:15:15,830 --> 00:15:18,455
       
  1516 Where I want to
       
  1517 prove this property?
       
  1518 
       
  1519 331
       
  1520 00:15:18,455 --> 00:15:21,935
       
  1521 So let me write this here
       
  1522 again on the bottom.
       
  1523 
       
  1524 332
       
  1525 00:15:21,935 --> 00:15:30,935
       
  1526 So p of R1 plus R2
       
  1527 is narrow form of R1
       
  1528 
       
  1529 333
       
  1530 00:15:30,935 --> 00:15:34,595
       
  1531 plus R2 if and only if
       
  1532 
       
  1533 334
       
  1534 00:15:34,595 --> 00:15:40,610
       
  1535 the empty string is in the
       
  1536 language of R1 plus R2.
       
  1537 
       
  1538 335
       
  1539 00:15:40,610 --> 00:15:43,280
       
  1540 Well, the first thing I can do is
       
  1541 
       
  1542 336
       
  1543 00:15:43,280 --> 00:15:46,760
       
  1544 I can infer what's the
       
  1545 definition of that?
       
  1546 
       
  1547 337
       
  1548 00:15:46,760 --> 00:15:51,920
       
  1549 Where that would
       
  1550 be nullable of R1
       
  1551 
       
  1552 338
       
  1553 00:15:51,920 --> 00:15:57,530
       
  1554 or nullable of our two.
       
  1555 
       
  1556 339
       
  1557 00:15:57,530 --> 00:15:59,960
       
  1558 That's how it was defined, okay?
       
  1559 
       
  1560 340
       
  1561 00:15:59,960 --> 00:16:01,730
       
  1562 And what can I do with this?
       
  1563 
       
  1564 341
       
  1565 00:16:01,730 --> 00:16:04,700
       
  1566 I can also look at how is
       
  1567 
       
  1568 342
       
  1569 00:16:04,700 --> 00:16:08,510
       
  1570 this language to find
       
  1571 where that is defined as
       
  1572 
       
  1573 343
       
  1574 00:16:08,510 --> 00:16:16,730
       
  1575 the empty string in the
       
  1576 language of R1 union R2.
       
  1577 
       
  1578 344
       
  1579 00:16:16,730 --> 00:16:19,910
       
  1580 Okay? So what does that mean?
       
  1581 
       
  1582 345
       
  1583 00:16:19,910 --> 00:16:23,585
       
  1584 Well, Venice, the empty
       
  1585 string in a union.
       
  1586 
       
  1587 346
       
  1588 00:16:23,585 --> 00:16:27,095
       
  1589 Well, it has to be
       
  1590 the case that either
       
  1591 
       
  1592 347
       
  1593 00:16:27,095 --> 00:16:31,190
       
  1594 the empty string
       
  1595 is in L of R one,
       
  1596 
       
  1597 348
       
  1598 00:16:31,190 --> 00:16:37,670
       
  1599 all the empty string
       
  1600 is o. L of R two.
       
  1601 
       
  1602 349
       
  1603 00:16:37,670 --> 00:16:43,715
       
  1604 Ok, so that's, I just
       
  1605 applied all the definitions.
       
  1606 
       
  1607 350
       
  1608 00:16:43,715 --> 00:16:47,930
       
  1609 And now I have to look better
       
  1610 my induction hypothesis.
       
  1611 
       
  1612 351
       
  1613 00:16:47,930 --> 00:16:50,930
       
  1614 These somehow help.
       
  1615 
       
  1616 352
       
  1617 00:16:50,930 --> 00:16:55,820
       
  1618 Well, I know nullable
       
  1619 of R1 holds if
       
  1620 
       
  1621 353
       
  1622 00:16:55,820 --> 00:17:00,815
       
  1623 and only if the empty
       
  1624 string is in L of one.
       
  1625 
       
  1626 354
       
  1627 00:17:00,815 --> 00:17:05,330
       
  1628 So that is by the induction
       
  1629 hypothesis, a hypothesis.
       
  1630 
       
  1631 355
       
  1632 00:17:05,330 --> 00:17:07,670
       
  1633 And I also know by this,
       
  1634 
       
  1635 356
       
  1636 00:17:07,670 --> 00:17:10,040
       
  1637 that this holds only if
       
  1638 
       
  1639 357
       
  1640 00:17:10,040 --> 00:17:17,465
       
  1641 the empty string is in R2
       
  1642 and there's an in-between.
       
  1643 
       
  1644 358
       
  1645 00:17:17,465 --> 00:17:19,580
       
  1646 Now you can see again,
       
  1647 
       
  1648 359
       
  1649 00:17:19,580 --> 00:17:23,690
       
  1650 this is what my specification
       
  1651 say It must hold.
       
  1652 
       
  1653 360
       
  1654 00:17:23,690 --> 00:17:25,640
       
  1655 And this is essentially
       
  1656 
       
  1657 361
       
  1658 00:17:25,640 --> 00:17:28,685
       
  1659 what I could infer from
       
  1660 my implementation.
       
  1661 
       
  1662 362
       
  1663 00:17:28,685 --> 00:17:30,140
       
  1664 And they are the same.
       
  1665 
       
  1666 363
       
  1667 00:17:30,140 --> 00:17:32,270
       
  1668 So also this case,
       
  1669 
       
  1670 364
       
  1671 00:17:32,270 --> 00:17:35,520
       
  1672 I've shown my property holds.
       
  1673 
       
  1674 365
       
  1675 00:17:35,740 --> 00:17:39,275
       
  1676 Just for fun. Let's do this
       
  1677 also for the sequence.
       
  1678 
       
  1679 366
       
  1680 00:17:39,275 --> 00:17:43,835
       
  1681 So we want to show
       
  1682 P of 014 by two.
       
  1683 
       
  1684 367
       
  1685 00:17:43,835 --> 00:17:51,050
       
  1686 And that means I have to
       
  1687 prove that nullable of R1 by
       
  1688 
       
  1689 368
       
  1690 00:17:51,050 --> 00:17:54,095
       
  1691 R2 if and only if
       
  1692 
       
  1693 369
       
  1694 00:17:54,095 --> 00:18:00,440
       
  1695 the empty string is
       
  1696 one followed by two.
       
  1697 
       
  1698 370
       
  1699 00:18:00,440 --> 00:18:04,880
       
  1700 Ok? And in this case
       
  1701 I'm just writing
       
  1702 
       
  1703 371
       
  1704 00:18:04,880 --> 00:18:09,695
       
  1705 my induction
       
  1706 hypothesis on the top.
       
  1707 
       
  1708 372
       
  1709 00:18:09,695 --> 00:18:11,060
       
  1710 And that would be,
       
  1711 
       
  1712 373
       
  1713 00:18:11,060 --> 00:18:13,460
       
  1714 I know that p of r one holds.
       
  1715 
       
  1716 374
       
  1717 00:18:13,460 --> 00:18:18,290
       
  1718 So another ball of R one
       
  1719 
       
  1720 375
       
  1721 00:18:18,290 --> 00:18:24,860
       
  1722 if and only if the empty
       
  1723 string is in L one.
       
  1724 
       
  1725 376
       
  1726 00:18:24,860 --> 00:18:29,525
       
  1727 And the P of R to say
       
  1728 is that I know already
       
  1729 
       
  1730 377
       
  1731 00:18:29,525 --> 00:18:36,815
       
  1732 nullable of R2 if and only
       
  1733 if the empty string then R2.
       
  1734 
       
  1735 378
       
  1736 00:18:36,815 --> 00:18:39,230
       
  1737 Okay? And I know some hard to
       
  1738 
       
  1739 379
       
  1740 00:18:39,230 --> 00:18:42,285
       
  1741 infer that under
       
  1742 these assumptions,
       
  1743 
       
  1744 380
       
  1745 00:18:42,285 --> 00:18:45,970
       
  1746 this property, like
       
  1747 in the earlier case,
       
  1748 
       
  1749 381
       
  1750 00:18:45,970 --> 00:18:48,070
       
  1751 I just have a look how
       
  1752 
       
  1753 382
       
  1754 00:18:48,070 --> 00:18:51,145
       
  1755 everything is defined and
       
  1756 see if I can go from there.
       
  1757 
       
  1758 383
       
  1759 00:18:51,145 --> 00:18:56,485
       
  1760 So nullable of the
       
  1761 sequence would be nullable
       
  1762 
       
  1763 384
       
  1764 00:18:56,485 --> 00:19:04,555
       
  1765 of R1 and nullable of R2.
       
  1766 
       
  1767 385
       
  1768 00:19:04,555 --> 00:19:08,559
       
  1769 Okay? By the
       
  1770 induction hypothesis,
       
  1771 
       
  1772 386
       
  1773 00:19:08,559 --> 00:19:16,255
       
  1774 I can now infer that
       
  1775 the empty string is
       
  1776 
       
  1777 387
       
  1778 00:19:16,255 --> 00:19:24,750
       
  1779 in L one and the empty
       
  1780 string is in L two.
       
  1781 
       
  1782 388
       
  1783 00:19:24,750 --> 00:19:28,175
       
  1784 Ok? Not much as I
       
  1785 can do on this side.
       
  1786 
       
  1787 389
       
  1788 00:19:28,175 --> 00:19:31,520
       
  1789 But I can now try
       
  1790 to infer something
       
  1791 
       
  1792 390
       
  1793 00:19:31,520 --> 00:19:33,110
       
  1794 here because I know how
       
  1795 
       
  1796 391
       
  1797 00:19:33,110 --> 00:19:35,735
       
  1798 this L of this
       
  1799 sequence is defined.
       
  1800 
       
  1801 392
       
  1802 00:19:35,735 --> 00:19:39,665
       
  1803 That says the empty
       
  1804 string has to be n l,
       
  1805 
       
  1806 393
       
  1807 00:19:39,665 --> 00:19:45,110
       
  1808 one append of L of R, two.
       
  1809 
       
  1810 394
       
  1811 00:19:45,110 --> 00:19:48,365
       
  1812 Ok? So they are not yet equal.
       
  1813 
       
  1814 395
       
  1815 00:19:48,365 --> 00:19:50,150
       
  1816 But now I have to
       
  1817 essentially scratch
       
  1818 
       
  1819 396
       
  1820 00:19:50,150 --> 00:19:52,430
       
  1821 my head. What does that mean?
       
  1822 
       
  1823 397
       
  1824 00:19:52,430 --> 00:19:54,965
       
  1825 So this is the language
       
  1826 concatenation,
       
  1827 
       
  1828 398
       
  1829 00:19:54,965 --> 00:19:57,035
       
  1830 Venice, the empty string.
       
  1831 
       
  1832 399
       
  1833 00:19:57,035 --> 00:20:01,265
       
  1834 In this concatenation of
       
  1835 these two language as well.
       
  1836 
       
  1837 400
       
  1838 00:20:01,265 --> 00:20:05,990
       
  1839 It will be there if
       
  1840 L of our wormholes
       
  1841 
       
  1842 401
       
  1843 00:20:05,990 --> 00:20:12,560
       
  1844 and The empty string
       
  1845 of L R2 holds.
       
  1846 
       
  1847 402
       
  1848 00:20:12,560 --> 00:20:14,600
       
  1849 And so again, I have shown that
       
  1850 
       
  1851 403
       
  1852 00:20:14,600 --> 00:20:17,780
       
  1853 this is equivalent to that.
       
  1854 
       
  1855 404
       
  1856 00:20:17,780 --> 00:20:20,480
       
  1857 Which means that this must be
       
  1858 
       
  1859 405
       
  1860 00:20:20,480 --> 00:20:24,540
       
  1861 equal to that o holds
       
  1862 if and only if.
       
  1863 
       
  1864 406
       
  1865 00:20:25,600 --> 00:20:28,940
       
  1866 Finally, the star case,
       
  1867 
       
  1868 407
       
  1869 00:20:28,940 --> 00:20:31,235
       
  1870 p of r star.
       
  1871 
       
  1872 408
       
  1873 00:20:31,235 --> 00:20:35,450
       
  1874 Ok? So I know my
       
  1875 induction hypothesis,
       
  1876 
       
  1877 409
       
  1878 00:20:35,450 --> 00:20:41,075
       
  1879 let me write it again on
       
  1880 the top that P of R holds,
       
  1881 
       
  1882 410
       
  1883 00:20:41,075 --> 00:20:46,835
       
  1884 which means nullable of R if and
       
  1885 
       
  1886 411
       
  1887 00:20:46,835 --> 00:20:53,060
       
  1888 only if the empty
       
  1889 string is in L. Okay?
       
  1890 
       
  1891 412
       
  1892 00:20:53,060 --> 00:20:58,580
       
  1893 And I have to prove
       
  1894 that nullable of
       
  1895 
       
  1896 413
       
  1897 00:20:58,580 --> 00:21:08,930
       
  1898 r star if and only if the
       
  1899 empty string is n l of r star.
       
  1900 
       
  1901 414
       
  1902 00:21:08,930 --> 00:21:15,020
       
  1903 Ok? So again, I just follow
       
  1904 how are things define?
       
  1905 
       
  1906 415
       
  1907 00:21:15,020 --> 00:21:18,125
       
  1908 So this is defined
       
  1909 actually as true.
       
  1910 
       
  1911 416
       
  1912 00:21:18,125 --> 00:21:21,365
       
  1913 So a star's always nullable.
       
  1914 
       
  1915 417
       
  1916 00:21:21,365 --> 00:21:23,090
       
  1917 What is this defined where,
       
  1918 
       
  1919 418
       
  1920 00:21:23,090 --> 00:21:26,225
       
  1921 remember that wants to find
       
  1922 using the Kleene star.
       
  1923 
       
  1924 419
       
  1925 00:21:26,225 --> 00:21:28,685
       
  1926 So that is essentially L of
       
  1927 
       
  1928 420
       
  1929 00:21:28,685 --> 00:21:33,680
       
  1930 R. And then we had this
       
  1931 star on languages.
       
  1932 
       
  1933 421
       
  1934 00:21:33,680 --> 00:21:36,260
       
  1935 And does the star on
       
  1936 
       
  1937 422
       
  1938 00:21:36,260 --> 00:21:39,380
       
  1939 languages always contain
       
  1940 the empty string?
       
  1941 
       
  1942 423
       
  1943 00:21:39,380 --> 00:21:42,980
       
  1944 Yes, that is true as well.
       
  1945 
       
  1946 424
       
  1947 00:21:42,980 --> 00:21:45,320
       
  1948 And so again, I've
       
  1949 shown their clone.
       
  1950 
       
  1951 425
       
  1952 00:21:45,320 --> 00:21:47,420
       
  1953 And here you can see, even if I
       
  1954 
       
  1955 426
       
  1956 00:21:47,420 --> 00:21:49,730
       
  1957 have here in this case
       
  1958 an induction hypothesis,
       
  1959 
       
  1960 427
       
  1961 00:21:49,730 --> 00:21:51,860
       
  1962 I actually didn't need
       
  1963 that in my reasoning.
       
  1964 
       
  1965 428
       
  1966 00:21:51,860 --> 00:21:53,960
       
  1967 Well, that might happen.
       
  1968 
       
  1969 429
       
  1970 00:21:53,960 --> 00:21:56,840
       
  1971 This might have been the
       
  1972 first induction proof
       
  1973 
       
  1974 430
       
  1975 00:21:56,840 --> 00:21:58,445
       
  1976 for regular expressions for you.
       
  1977 
       
  1978 431
       
  1979 00:21:58,445 --> 00:22:00,680
       
  1980 But you've seen at
       
  1981 Watson so difficult,
       
  1982 
       
  1983 432
       
  1984 00:22:00,680 --> 00:22:02,480
       
  1985 you just follow this recipe.
       
  1986 
       
  1987 433
       
  1988 00:22:02,480 --> 00:22:04,280
       
  1989 It told us we can
       
  1990 
       
  1991 434
       
  1992 00:22:04,280 --> 00:22:06,725
       
  1993 assume in each case
       
  1994 what we have to prove.
       
  1995 
       
  1996 435
       
  1997 00:22:06,725 --> 00:22:09,860
       
  1998 And then we essentially just
       
  1999 followed what are things to
       
  2000 
       
  2001 436
       
  2002 00:22:09,860 --> 00:22:13,595
       
  2003 find and try to infer
       
  2004 whether in each case,
       
  2005 
       
  2006 437
       
  2007 00:22:13,595 --> 00:22:15,560
       
  2008 both of them produce
       
  2009 the same result,
       
  2010 
       
  2011 438
       
  2012 00:22:15,560 --> 00:22:18,245
       
  2013 the specification and
       
  2014 the implementation.
       
  2015 
       
  2016 439
       
  2017 00:22:18,245 --> 00:22:20,300
       
  2018 And the great thing about this,
       
  2019 
       
  2020 440
       
  2021 00:22:20,300 --> 00:22:22,250
       
  2022 if you've done all these cases,
       
  2023 
       
  2024 441
       
  2025 00:22:22,250 --> 00:22:25,490
       
  2026 we not just know that it holds
       
  2027 for some reg expression,
       
  2028 
       
  2029 442
       
  2030 00:22:25,490 --> 00:22:28,280
       
  2031 but actually it holds for
       
  2032 all wreck expressions be
       
  2033 
       
  2034 443
       
  2035 00:22:28,280 --> 00:22:30,890
       
  2036 couldn't achieve the same
       
  2037 result by looking just
       
  2038 
       
  2039 444
       
  2040 00:22:30,890 --> 00:22:33,710
       
  2041 at testing because
       
  2042 we would not be
       
  2043 
       
  2044 445
       
  2045 00:22:33,710 --> 00:22:37,280
       
  2046 able to test for all
       
  2047 record expressions.
       
  2048 
       
  2049 446
       
  2050 00:22:37,280 --> 00:22:39,560
       
  2051 Ok, of course, there will be
       
  2052 
       
  2053 447
       
  2054 00:22:39,560 --> 00:22:43,010
       
  2055 also induction proofs which
       
  2056 are more complicated.
       
  2057 
       
  2058 448
       
  2059 00:22:43,010 --> 00:22:46,170
       
  2060 But let's leave that
       
  2061 for another time.
       
  2062 
       
  2063 449
       
  2064 00:22:49,210 --> 00:22:53,405
       
  2065 You know, I always like to
       
  2066 talk about the subject.
       
  2067 
       
  2068 450
       
  2069 00:22:53,405 --> 00:22:55,700
       
  2070 So here's one last thought.
       
  2071 
       
  2072 451
       
  2073 00:22:55,700 --> 00:22:58,040
       
  2074 You might think these proofs for
       
  2075 
       
  2076 452
       
  2077 00:22:58,040 --> 00:22:59,930
       
  2078 this little property
       
  2079 about nullable are
       
  2080 
       
  2081 453
       
  2082 00:22:59,930 --> 00:23:04,295
       
  2083 totally overkill and
       
  2084 total waste of your time.
       
  2085 
       
  2086 454
       
  2087 00:23:04,295 --> 00:23:07,310
       
  2088 Please rest assured disproves
       
  2089 
       
  2090 455
       
  2091 00:23:07,310 --> 00:23:10,160
       
  2092 have saved me from so
       
  2093 much embarrassment.
       
  2094 
       
  2095 456
       
  2096 00:23:10,160 --> 00:23:13,745
       
  2097 You cannot imagine,
       
  2098 once I have a proof,
       
  2099 
       
  2100 457
       
  2101 00:23:13,745 --> 00:23:15,425
       
  2102 I can sleep much better.
       
  2103 
       
  2104 458
       
  2105 00:23:15,425 --> 00:23:16,820
       
  2106 If I don't have a proof,
       
  2107 
       
  2108 459
       
  2109 00:23:16,820 --> 00:23:19,560
       
  2110 I don't believe what I'm doing.