videos/02-algo2.srt
changeset 769 f9686b22db7e
child 772 3bf3f5bb067e
equal deleted inserted replaced
768:34f77b976b88 769:f9686b22db7e
       
     1 1
       
     2 00:00:06,020 --> 00:00:09,945
       
     3 They come back after
       
     4 this disappointment
       
     5 
       
     6 2
       
     7 00:00:09,945 --> 00:00:14,115
       
     8 and case of over promising
       
     9 and under achieving.
       
    10 
       
    11 3
       
    12 00:00:14,115 --> 00:00:17,295
       
    13 Let's have a look whether
       
    14 we can recover from that.
       
    15 
       
    16 4
       
    17 00:00:17,295 --> 00:00:20,535
       
    18 So here's one problem.
       
    19 
       
    20 5
       
    21 00:00:20,535 --> 00:00:23,790
       
    22 Then we looked at this end
       
    23 times vk expression and
       
    24 
       
    25 6
       
    26 00:00:23,790 --> 00:00:27,330
       
    27 be able to represent
       
    28 that directly.
       
    29 
       
    30 7
       
    31 00:00:27,330 --> 00:00:31,275
       
    32 We had two represented as a
       
    33 sequence record expression.
       
    34 
       
    35 8
       
    36 00:00:31,275 --> 00:00:32,850
       
    37 So we expanded it.
       
    38 
       
    39 9
       
    40 00:00:32,850 --> 00:00:36,510
       
    41 So times one would be just a,
       
    42 
       
    43 10
       
    44 00:00:36,510 --> 00:00:40,595
       
    45 t, times two would
       
    46 be a, and so on.
       
    47 
       
    48 11
       
    49 00:00:40,595 --> 00:00:43,535
       
    50 And so you can see if
       
    51 you go already to 13,
       
    52 
       
    53 12
       
    54 00:00:43,535 --> 00:00:47,960
       
    55 N equals 13, the record
       
    56 expression becomes quite large.
       
    57 
       
    58 13
       
    59 00:00:47,960 --> 00:00:52,865
       
    60 And now the functions
       
    61 nullable and also derivative.
       
    62 
       
    63 14
       
    64 00:00:52,865 --> 00:00:56,360
       
    65 They need to traverse
       
    66 this reg expression.
       
    67 
       
    68 15
       
    69 00:00:56,360 --> 00:00:59,060
       
    70 Remember this tree in sometimes
       
    71 
       
    72 16
       
    73 00:00:59,060 --> 00:01:01,820
       
    74 they have to traverse
       
    75 that even several times.
       
    76 
       
    77 17
       
    78 00:01:01,820 --> 00:01:04,535
       
    79 So that will slow
       
    80 down the algorithm.
       
    81 
       
    82 18
       
    83 00:01:04,535 --> 00:01:08,150
       
    84 And in particular, because
       
    85 our first reg expression was
       
    86 
       
    87 19
       
    88 00:01:08,150 --> 00:01:11,840
       
    89 actually not just a bot
       
    90 a plus one. So b hat.
       
    91 
       
    92 20
       
    93 00:01:11,840 --> 00:01:14,330
       
    94 In the case of 13,
       
    95 we had a plus one,
       
    96 
       
    97 21
       
    98 00:01:14,330 --> 00:01:17,330
       
    99 a plus one a plus born 13 times.
       
   100 
       
   101 22
       
   102 00:01:17,330 --> 00:01:20,150
       
   103 And this reg
       
   104 expression just grows,
       
   105 
       
   106 23
       
   107 00:01:20,150 --> 00:01:25,475
       
   108 stay in number n. Just to
       
   109 show you this also encode,
       
   110 
       
   111 24
       
   112 00:01:25,475 --> 00:01:28,115
       
   113 I'm Stern, This File rewarm
       
   114 
       
   115 25
       
   116 00:01:28,115 --> 00:01:30,920
       
   117 and D and I have a size function.
       
   118 
       
   119 26
       
   120 00:01:30,920 --> 00:01:33,140
       
   121 The size function takes
       
   122 a regular expression as
       
   123 
       
   124 27
       
   125 00:01:33,140 --> 00:01:36,215
       
   126 argument and produces in teacher.
       
   127 
       
   128 28
       
   129 00:01:36,215 --> 00:01:39,980
       
   130 And essentially it takes
       
   131 this rec expression scene as
       
   132 
       
   133 29
       
   134 00:01:39,980 --> 00:01:44,045
       
   135 a tree and count how many
       
   136 nodes are in this tree.
       
   137 
       
   138 30
       
   139 00:01:44,045 --> 00:01:49,490
       
   140 And so if I take this even
       
   141 one reg expression, this one,
       
   142 
       
   143 31
       
   144 00:01:49,490 --> 00:01:52,160
       
   145 and increase now this n. So you
       
   146 
       
   147 32
       
   148 00:01:52,160 --> 00:01:54,680
       
   149 can already see
       
   150 for any cross one,
       
   151 
       
   152 33
       
   153 00:01:54,680 --> 00:01:56,540
       
   154 the size of this
       
   155 record expression is
       
   156 
       
   157 34
       
   158 00:01:56,540 --> 00:01:59,615
       
   159 five and you see it
       
   160 slowly increases.
       
   161 
       
   162 35
       
   163 00:01:59,615 --> 00:02:02,150
       
   164 And this 20 n equals 20.
       
   165 
       
   166 36
       
   167 00:02:02,150 --> 00:02:07,100
       
   168 The size of this record
       
   169 expression is SMOW already a 119.
       
   170 
       
   171 37
       
   172 00:02:07,100 --> 00:02:10,145
       
   173 So now the interesting
       
   174 line as this one.
       
   175 
       
   176 38
       
   177 00:02:10,145 --> 00:02:13,295
       
   178 Remember it when we
       
   179 built the derivative,
       
   180 
       
   181 39
       
   182 00:02:13,295 --> 00:02:17,150
       
   183 very often parts of a reg
       
   184 expression gets copied.
       
   185 
       
   186 40
       
   187 00:02:17,150 --> 00:02:19,280
       
   188 So if you have like EVA,
       
   189 
       
   190 41
       
   191 00:02:19,280 --> 00:02:22,325
       
   192 one of 2019 nodes,
       
   193 
       
   194 42
       
   195 00:02:22,325 --> 00:02:25,475
       
   196 now this will be often copied.
       
   197 
       
   198 43
       
   199 00:02:25,475 --> 00:02:31,475
       
   200 And we want to see what's the
       
   201 resulting tree look like,
       
   202 
       
   203 44
       
   204 00:02:31,475 --> 00:02:32,780
       
   205 how many nodes it has.
       
   206 
       
   207 45
       
   208 00:02:32,780 --> 00:02:34,985
       
   209 So I take here either one of 20
       
   210 
       
   211 46
       
   212 00:02:34,985 --> 00:02:38,405
       
   213 and the derivative
       
   214 according to 20 a's.
       
   215 
       
   216 47
       
   217 00:02:38,405 --> 00:02:41,820
       
   218 And just have a look
       
   219 what the size is.
       
   220 
       
   221 48
       
   222 00:02:43,210 --> 00:02:45,680
       
   223 Okay, that takes away.
       
   224 
       
   225 49
       
   226 00:02:45,680 --> 00:02:48,410
       
   227 You see now this rec expression,
       
   228 
       
   229 50
       
   230 00:02:48,410 --> 00:02:52,280
       
   231 the derivative has already
       
   232 8 million plus nodes.
       
   233 
       
   234 51
       
   235 00:02:52,280 --> 00:02:55,400
       
   236 And now nullable and
       
   237 derivative have to
       
   238 
       
   239 52
       
   240 00:02:55,400 --> 00:02:58,430
       
   241 traverse these trees with
       
   242 a million plus nodes.
       
   243 
       
   244 53
       
   245 00:02:58,430 --> 00:03:01,400
       
   246 So it's no wonder
       
   247 that this is slow.
       
   248 
       
   249 54
       
   250 00:03:01,400 --> 00:03:03,860
       
   251 The first thing we
       
   252 can try to mitigate
       
   253 
       
   254 55
       
   255 00:03:03,860 --> 00:03:06,350
       
   256 this explosion problem is to
       
   257 
       
   258 56
       
   259 00:03:06,350 --> 00:03:09,050
       
   260 have an explicit and
       
   261 times reg expression.
       
   262 
       
   263 57
       
   264 00:03:09,050 --> 00:03:11,600
       
   265 So instead of expanding
       
   266 
       
   267 58
       
   268 00:03:11,600 --> 00:03:13,415
       
   269 it to the sequence
       
   270 reg expression,
       
   271 
       
   272 59
       
   273 00:03:13,415 --> 00:03:17,510
       
   274 let's just have a single
       
   275 wreck expression and times,
       
   276 
       
   277 60
       
   278 00:03:17,510 --> 00:03:20,540
       
   279 which takes an expression and
       
   280 
       
   281 61
       
   282 00:03:20,540 --> 00:03:24,470
       
   283 a number and keeps that
       
   284 regular expression Small.
       
   285 
       
   286 62
       
   287 00:03:24,470 --> 00:03:27,095
       
   288 I'm here in the fire V2,
       
   289 
       
   290 63
       
   291 00:03:27,095 --> 00:03:29,090
       
   292 which is also on Keats.
       
   293 
       
   294 64
       
   295 00:03:29,090 --> 00:03:32,570
       
   296 And the only change I made
       
   297 is I added explicitly
       
   298 
       
   299 65
       
   300 00:03:32,570 --> 00:03:33,860
       
   301 a wrecker expression for
       
   302 
       
   303 66
       
   304 00:03:33,860 --> 00:03:36,770
       
   305 N times the optional
       
   306 reg expression.
       
   307 
       
   308 67
       
   309 00:03:36,770 --> 00:03:39,920
       
   310 Very leaf out at the
       
   311 moment because this a
       
   312 
       
   313 68
       
   314 00:03:39,920 --> 00:03:41,975
       
   315 optional is represented as
       
   316 
       
   317 69
       
   318 00:03:41,975 --> 00:03:44,645
       
   319 a plus one and doesn't
       
   320 explain too much.
       
   321 
       
   322 70
       
   323 00:03:44,645 --> 00:03:47,375
       
   324 The really the culprit
       
   325 is this n times.
       
   326 
       
   327 71
       
   328 00:03:47,375 --> 00:03:51,095
       
   329 And instead of expanding
       
   330 it n times, which has,
       
   331 
       
   332 72
       
   333 00:03:51,095 --> 00:03:54,275
       
   334 take here a wreck expression
       
   335 and a natural number,
       
   336 
       
   337 73
       
   338 00:03:54,275 --> 00:03:56,960
       
   339 which says how many times
       
   340 it should be repeated.
       
   341 
       
   342 74
       
   343 00:03:56,960 --> 00:03:59,165
       
   344 And in this week we
       
   345 can keep it small.
       
   346 
       
   347 75
       
   348 00:03:59,165 --> 00:04:01,370
       
   349 But by adding that
       
   350 reg expression,
       
   351 
       
   352 76
       
   353 00:04:01,370 --> 00:04:05,150
       
   354 we now have to think about
       
   355 cases for nullable and
       
   356 
       
   357 77
       
   358 00:04:05,150 --> 00:04:07,340
       
   359 derivative so that we have
       
   360 
       
   361 78
       
   362 00:04:07,340 --> 00:04:10,205
       
   363 to now calculate next
       
   364 what they look like.
       
   365 
       
   366 79
       
   367 00:04:10,205 --> 00:04:14,060
       
   368 We can actually
       
   369 calculate what the rule
       
   370 
       
   371 80
       
   372 00:04:14,060 --> 00:04:17,525
       
   373 for nullable should be from
       
   374 how we defined it earlier.
       
   375 
       
   376 81
       
   377 00:04:17,525 --> 00:04:19,580
       
   378 Remember, if you have
       
   379 a regular expression,
       
   380 
       
   381 82
       
   382 00:04:19,580 --> 00:04:21,980
       
   383 R and B take in
       
   384 times of this are,
       
   385 
       
   386 83
       
   387 00:04:21,980 --> 00:04:25,220
       
   388 then indicates if n equals 0,
       
   389 
       
   390 84
       
   391 00:04:25,220 --> 00:04:28,130
       
   392 we find that as the
       
   393 one reg expression.
       
   394 
       
   395 85
       
   396 00:04:28,130 --> 00:04:30,380
       
   397 If n equals one,
       
   398 
       
   399 86
       
   400 00:04:30,380 --> 00:04:32,495
       
   401 it will be just a
       
   402 single copy of on.
       
   403 
       
   404 87
       
   405 00:04:32,495 --> 00:04:33,725
       
   406 If n equals two,
       
   407 
       
   408 88
       
   409 00:04:33,725 --> 00:04:35,270
       
   410 it will be two copies.
       
   411 
       
   412 89
       
   413 00:04:35,270 --> 00:04:39,260
       
   414 Sequence if three, we have
       
   415 three copies and so on.
       
   416 
       
   417 90
       
   418 00:04:39,260 --> 00:04:41,390
       
   419 Now we have to
       
   420 somehow characterize
       
   421 
       
   422 91
       
   423 00:04:41,390 --> 00:04:44,285
       
   424 when these cases all nullable.
       
   425 
       
   426 92
       
   427 00:04:44,285 --> 00:04:47,810
       
   428 Well, if n equals 0,
       
   429 
       
   430 93
       
   431 00:04:47,810 --> 00:04:49,850
       
   432 then this will be defined as one,
       
   433 
       
   434 94
       
   435 00:04:49,850 --> 00:04:52,100
       
   436 so one can match
       
   437 the empty string.
       
   438 
       
   439 95
       
   440 00:04:52,100 --> 00:04:54,785
       
   441 So that should be
       
   442 definitely true.
       
   443 
       
   444 96
       
   445 00:04:54,785 --> 00:04:57,725
       
   446 Okay, if any cross one,
       
   447 
       
   448 97
       
   449 00:04:57,725 --> 00:05:00,470
       
   450 when can this reg expression
       
   451 match the empty string?
       
   452 
       
   453 98
       
   454 00:05:00,470 --> 00:05:02,000
       
   455 So vent should be nullable.
       
   456 
       
   457 99
       
   458 00:05:02,000 --> 00:05:07,535
       
   459 Well, if nullable of our holes,
       
   460 
       
   461 100
       
   462 00:05:07,535 --> 00:05:09,860
       
   463 okay, so if I can match
       
   464 the empty string,
       
   465 
       
   466 101
       
   467 00:05:09,860 --> 00:05:12,110
       
   468 then we can match
       
   469 the empty string.
       
   470 
       
   471 102
       
   472 00:05:12,110 --> 00:05:14,870
       
   473 When can this regular expression
       
   474 match the empty string?
       
   475 
       
   476 103
       
   477 00:05:14,870 --> 00:05:16,265
       
   478 So these are now two copies of
       
   479 
       
   480 104
       
   481 00:05:16,265 --> 00:05:20,690
       
   482 our well-posed copies need
       
   483 to match the empty string.
       
   484 
       
   485 105
       
   486 00:05:20,690 --> 00:05:22,820
       
   487 So again, we have to ask
       
   488 
       
   489 106
       
   490 00:05:22,820 --> 00:05:25,774
       
   491 both of them need to be nullable.
       
   492 
       
   493 107
       
   494 00:05:25,774 --> 00:05:28,520
       
   495 Ok? Similarly, in the third case,
       
   496 
       
   497 108
       
   498 00:05:28,520 --> 00:05:30,710
       
   499 all three copies need to be
       
   500 
       
   501 109
       
   502 00:05:30,710 --> 00:05:33,230
       
   503 able to match the empty
       
   504 string. Men, is that the case?
       
   505 
       
   506 110
       
   507 00:05:33,230 --> 00:05:38,360
       
   508 Well, if nullable of
       
   509 our holes and so on.
       
   510 
       
   511 111
       
   512 00:05:38,360 --> 00:05:48,754
       
   513 So we can actually define that
       
   514 if n equals 0, then true.
       
   515 
       
   516 112
       
   517 00:05:48,754 --> 00:05:56,045
       
   518 Else we have to ask with
       
   519 nullable of our holes.
       
   520 
       
   521 113
       
   522 00:05:56,045 --> 00:05:58,550
       
   523 So that will be the clause we
       
   524 
       
   525 114
       
   526 00:05:58,550 --> 00:06:01,625
       
   527 have to implement for
       
   528 n times and nullable.
       
   529 
       
   530 115
       
   531 00:06:01,625 --> 00:06:04,280
       
   532 Deriving the definition for
       
   533 
       
   534 116
       
   535 00:06:04,280 --> 00:06:06,920
       
   536 the derivative of the n terms
       
   537 
       
   538 117
       
   539 00:06:06,920 --> 00:06:10,175
       
   540 reg expression.
       
   541 It's not that easy.
       
   542 
       
   543 118
       
   544 00:06:10,175 --> 00:06:12,755
       
   545 So we have to look again
       
   546 how it was defined
       
   547 
       
   548 119
       
   549 00:06:12,755 --> 00:06:16,010
       
   550 earlier and we somehow
       
   551 have to spot a pattern.
       
   552 
       
   553 120
       
   554 00:06:16,010 --> 00:06:18,380
       
   555 The voice, our
       
   556 algorithm will be again
       
   557 
       
   558 121
       
   559 00:06:18,380 --> 00:06:20,975
       
   560 quite slow if you don't
       
   561 spot that pattern.
       
   562 
       
   563 122
       
   564 00:06:20,975 --> 00:06:22,550
       
   565 So let's have a look.
       
   566 
       
   567 123
       
   568 00:06:22,550 --> 00:06:26,240
       
   569 So in the case that
       
   570 n is equal to 0,
       
   571 
       
   572 124
       
   573 00:06:26,240 --> 00:06:29,885
       
   574 then r n times most
       
   575 defined as just one.
       
   576 
       
   577 125
       
   578 00:06:29,885 --> 00:06:32,030
       
   579 What would be the
       
   580 derivative of one?
       
   581 
       
   582 126
       
   583 00:06:32,030 --> 00:06:36,140
       
   584 So the derivative of c of one.
       
   585 
       
   586 127
       
   587 00:06:36,140 --> 00:06:38,990
       
   588 Peaches defined as 0.
       
   589 
       
   590 128
       
   591 00:06:38,990 --> 00:06:41,465
       
   592 Okay, fair enough.
       
   593 
       
   594 129
       
   595 00:06:41,465 --> 00:06:44,960
       
   596 If he have any cross one,
       
   597 
       
   598 130
       
   599 00:06:44,960 --> 00:06:48,125
       
   600 then we just have one copy
       
   601 of this regular expression.
       
   602 
       
   603 131
       
   604 00:06:48,125 --> 00:06:50,120
       
   605 So there's not much as we can do.
       
   606 
       
   607 132
       
   608 00:06:50,120 --> 00:06:53,735
       
   609 We would have to calculate
       
   610 the derivative of air are.
       
   611 
       
   612 133
       
   613 00:06:53,735 --> 00:06:57,395
       
   614 Now in the n equals two case.
       
   615 
       
   616 134
       
   617 00:06:57,395 --> 00:07:00,245
       
   618 That would mean we
       
   619 have two copies of
       
   620 
       
   621 135
       
   622 00:07:00,245 --> 00:07:03,425
       
   623 R. And they would be a sequence.
       
   624 
       
   625 136
       
   626 00:07:03,425 --> 00:07:07,775
       
   627 How would be the derivative C of
       
   628 
       
   629 137
       
   630 00:07:07,775 --> 00:07:10,340
       
   631 R four by R be
       
   632 
       
   633 138
       
   634 00:07:10,340 --> 00:07:13,265
       
   635 defined where we would
       
   636 look up the definition.
       
   637 
       
   638 139
       
   639 00:07:13,265 --> 00:07:15,470
       
   640 And it would say that's
       
   641 the complicated case
       
   642 
       
   643 140
       
   644 00:07:15,470 --> 00:07:16,760
       
   645 d sequence or
       
   646 
       
   647 141
       
   648 00:07:16,760 --> 00:07:21,875
       
   649 if null a bowl of R,
       
   650 
       
   651 142
       
   652 00:07:21,875 --> 00:07:25,250
       
   653 Then the most complicated case.
       
   654 
       
   655 143
       
   656 00:07:25,250 --> 00:07:28,225
       
   657 Else, That's the easy
       
   658 case that would be
       
   659 
       
   660 144
       
   661 00:07:28,225 --> 00:07:32,660
       
   662 derivative of c of R four
       
   663 
       
   664 145
       
   665 00:07:32,660 --> 00:07:35,540
       
   666 by R. And then I
       
   667 just have to copy
       
   668 
       
   669 146
       
   670 00:07:35,540 --> 00:07:40,775
       
   671 that derivative of C
       
   672 of four by r here.
       
   673 
       
   674 147
       
   675 00:07:40,775 --> 00:07:43,130
       
   676 But in this case I have also
       
   677 
       
   678 148
       
   679 00:07:43,130 --> 00:07:51,145
       
   680 the alternative derivative
       
   681 of c of r. And from that,
       
   682 
       
   683 149
       
   684 00:07:51,145 --> 00:07:55,030
       
   685 it looks like we can
       
   686 do much better than
       
   687 
       
   688 150
       
   689 00:07:55,030 --> 00:07:58,390
       
   690 that unless we do
       
   691 
       
   692 151
       
   693 00:07:58,390 --> 00:08:02,560
       
   694 a little bit of magic and the
       
   695 magic to do with this case.
       
   696 
       
   697 152
       
   698 00:08:02,560 --> 00:08:07,420
       
   699 So let me look at this
       
   700 case a bit more closely.
       
   701 
       
   702 153
       
   703 00:08:07,420 --> 00:08:09,819
       
   704 Let me go back to slides
       
   705 
       
   706 154
       
   707 00:08:09,819 --> 00:08:12,940
       
   708 because actually the calculation
       
   709 might be a bit hairy.
       
   710 
       
   711 155
       
   712 00:08:12,940 --> 00:08:16,945
       
   713 So remember we are in this
       
   714 case where n equals two.
       
   715 
       
   716 156
       
   717 00:08:16,945 --> 00:08:18,550
       
   718 And this was defined as
       
   719 
       
   720 157
       
   721 00:08:18,550 --> 00:08:20,680
       
   722 this sequence work
       
   723 expression R followed
       
   724 
       
   725 158
       
   726 00:08:20,680 --> 00:08:23,080
       
   727 by r. And the question was,
       
   728 
       
   729 159
       
   730 00:08:23,080 --> 00:08:26,365
       
   731 can we somehow make sense
       
   732 out of this derivative
       
   733 
       
   734 160
       
   735 00:08:26,365 --> 00:08:30,370
       
   736 where we have to build the
       
   737 derivative of a sequence.
       
   738 
       
   739 161
       
   740 00:08:30,370 --> 00:08:33,020
       
   741 So features the definition.
       
   742 
       
   743 162
       
   744 00:08:33,020 --> 00:08:36,050
       
   745 We would ask if
       
   746 the r is nullable,
       
   747 
       
   748 163
       
   749 00:08:36,050 --> 00:08:39,095
       
   750 In this case, we return
       
   751 this alternative.
       
   752 
       
   753 164
       
   754 00:08:39,095 --> 00:08:40,640
       
   755 And if it's not nullable,
       
   756 
       
   757 165
       
   758 00:08:40,640 --> 00:08:44,135
       
   759 It is just this
       
   760 record expression.
       
   761 
       
   762 166
       
   763 00:08:44,135 --> 00:08:49,550
       
   764 Now my claim is that
       
   765 this whole if condition
       
   766 
       
   767 167
       
   768 00:08:49,550 --> 00:08:55,895
       
   769 can be replaced by just this
       
   770 simple derivative here.
       
   771 
       
   772 168
       
   773 00:08:55,895 --> 00:08:58,775
       
   774 And if that claim is correct,
       
   775 
       
   776 169
       
   777 00:08:58,775 --> 00:09:01,520
       
   778 there would be very nice
       
   779 because in contrast to
       
   780 
       
   781 170
       
   782 00:09:01,520 --> 00:09:04,130
       
   783 this if condition where
       
   784 
       
   785 171
       
   786 00:09:04,130 --> 00:09:06,280
       
   787 we have to calculate
       
   788 like nullable,
       
   789 
       
   790 172
       
   791 00:09:06,280 --> 00:09:08,330
       
   792 decide in which branch we are.
       
   793 
       
   794 173
       
   795 00:09:08,330 --> 00:09:10,580
       
   796 We don't have to
       
   797 calculate your now,
       
   798 
       
   799 174
       
   800 00:09:10,580 --> 00:09:13,880
       
   801 but we can just replace
       
   802 it by this expression.
       
   803 
       
   804 175
       
   805 00:09:13,880 --> 00:09:16,670
       
   806 So if we can
       
   807 substantiate that claim,
       
   808 
       
   809 176
       
   810 00:09:16,670 --> 00:09:19,860
       
   811 that will be definitely
       
   812 good file algorithm.
       
   813 
       
   814 177
       
   815 00:09:20,140 --> 00:09:22,775
       
   816 And to substantiate that,
       
   817 
       
   818 178
       
   819 00:09:22,775 --> 00:09:26,795
       
   820 I will focus on this
       
   821 record expression here.
       
   822 
       
   823 179
       
   824 00:09:26,795 --> 00:09:31,100
       
   825 And notice that this record
       
   826 expression the only be
       
   827 
       
   828 180
       
   829 00:09:31,100 --> 00:09:35,780
       
   830 called or only be generated
       
   831 if r is nullable.
       
   832 
       
   833 181
       
   834 00:09:35,780 --> 00:09:38,075
       
   835 So in any other case,
       
   836 
       
   837 182
       
   838 00:09:38,075 --> 00:09:40,060
       
   839 I will actually not go into it
       
   840 
       
   841 183
       
   842 00:09:40,060 --> 00:09:43,850
       
   843 that if branch and would
       
   844 be in the other one.
       
   845 
       
   846 184
       
   847 00:09:43,850 --> 00:09:45,260
       
   848 So if we are in this if branch,
       
   849 
       
   850 185
       
   851 00:09:45,260 --> 00:09:47,705
       
   852 we definitely know
       
   853 that R is nullable.
       
   854 
       
   855 186
       
   856 00:09:47,705 --> 00:09:52,955
       
   857 Okay? Okay, so here's
       
   858 our regular expression.
       
   859 
       
   860 187
       
   861 00:09:52,955 --> 00:09:55,940
       
   862 And we know it's nullable.
       
   863 
       
   864 188
       
   865 00:09:55,940 --> 00:09:57,920
       
   866 So we have to somehow find
       
   867 
       
   868 189
       
   869 00:09:57,920 --> 00:10:00,380
       
   870 an equivalent expression that
       
   871 
       
   872 190
       
   873 00:10:00,380 --> 00:10:04,100
       
   874 is smaller or simpler
       
   875 than that one.
       
   876 
       
   877 191
       
   878 00:10:04,100 --> 00:10:05,945
       
   879 Let's see what we can do.
       
   880 
       
   881 192
       
   882 00:10:05,945 --> 00:10:10,160
       
   883 So the first thing
       
   884 actually is we multiplying
       
   885 
       
   886 193
       
   887 00:10:10,160 --> 00:10:16,595
       
   888 this right hand side of the
       
   889 alternative is times one.
       
   890 
       
   891 194
       
   892 00:10:16,595 --> 00:10:19,700
       
   893 We can do that
       
   894 because this does not
       
   895 
       
   896 195
       
   897 00:10:19,700 --> 00:10:23,090
       
   898 change which strings this
       
   899 work expression can match.
       
   900 
       
   901 196
       
   902 00:10:23,090 --> 00:10:25,685
       
   903 Remember we even had it
       
   904 as a simplification row,
       
   905 
       
   906 197
       
   907 00:10:25,685 --> 00:10:27,425
       
   908 just in this case B,
       
   909 
       
   910 198
       
   911 00:10:27,425 --> 00:10:29,705
       
   912 don't apply it as a
       
   913 simplification will
       
   914 
       
   915 199
       
   916 00:10:29,705 --> 00:10:31,310
       
   917 actually make this
       
   918 work expression
       
   919 
       
   920 200
       
   921 00:10:31,310 --> 00:10:32,720
       
   922 a bit more complicated.
       
   923 
       
   924 201
       
   925 00:10:32,720 --> 00:10:34,910
       
   926 But times one doesn't make
       
   927 
       
   928 202
       
   929 00:10:34,910 --> 00:10:37,820
       
   930 a difference because it
       
   931 means the end of the string,
       
   932 
       
   933 203
       
   934 00:10:37,820 --> 00:10:40,070
       
   935 we still want to match
       
   936 the empty string.
       
   937 
       
   938 204
       
   939 00:10:40,070 --> 00:10:42,155
       
   940 Okay, so that is possible.
       
   941 
       
   942 205
       
   943 00:10:42,155 --> 00:10:45,740
       
   944 I can do that. Once
       
   945 we have done that,
       
   946 
       
   947 206
       
   948 00:10:45,740 --> 00:10:48,410
       
   949 you will notice that this
       
   950 factor derivative of
       
   951 
       
   952 207
       
   953 00:10:48,410 --> 00:10:51,860
       
   954 stuff are exactly the
       
   955 same as that one.
       
   956 
       
   957 208
       
   958 00:10:51,860 --> 00:10:54,650
       
   959 And in between is a plus.
       
   960 
       
   961 209
       
   962 00:10:54,650 --> 00:10:57,440
       
   963 So you probably remember the law
       
   964 
       
   965 210
       
   966 00:10:57,440 --> 00:11:00,170
       
   967 from school math
       
   968 that I can pull out
       
   969 
       
   970 211
       
   971 00:11:00,170 --> 00:11:02,735
       
   972 this factor derivative of c of r.
       
   973 
       
   974 212
       
   975 00:11:02,735 --> 00:11:06,320
       
   976 And I'm inside the parentheses
       
   977 is left with that.
       
   978 
       
   979 213
       
   980 00:11:06,320 --> 00:11:09,245
       
   981 So now I have r plus one.
       
   982 
       
   983 214
       
   984 00:11:09,245 --> 00:11:13,055
       
   985 Usually we cannot
       
   986 simplify r plus one.
       
   987 
       
   988 215
       
   989 00:11:13,055 --> 00:11:15,530
       
   990 If it had been R
       
   991 plus 0, then yes,
       
   992 
       
   993 216
       
   994 00:11:15,530 --> 00:11:18,665
       
   995 we could have got rid of the CRO.
       
   996 
       
   997 217
       
   998 00:11:18,665 --> 00:11:21,590
       
   999 Plus one often
       
  1000 makes a difference,
       
  1001 
       
  1002 218
       
  1003 00:11:21,590 --> 00:11:22,970
       
  1004 but not in our case.
       
  1005 
       
  1006 219
       
  1007 00:11:22,970 --> 00:11:25,940
       
  1008 Remember, we know that
       
  1009 this R is nullable,
       
  1010 
       
  1011 220
       
  1012 00:11:25,940 --> 00:11:29,840
       
  1013 so on its own can already
       
  1014 match the empty string.
       
  1015 
       
  1016 221
       
  1017 00:11:29,840 --> 00:11:33,305
       
  1018 So we don't really need this
       
  1019 alternative plus one there,
       
  1020 
       
  1021 222
       
  1022 00:11:33,305 --> 00:11:35,300
       
  1023 so we can just get rid of that.
       
  1024 
       
  1025 223
       
  1026 00:11:35,300 --> 00:11:37,070
       
  1027 Okay, and so now we have
       
  1028 
       
  1029 224
       
  1030 00:11:37,070 --> 00:11:40,535
       
  1031 a much simpler wound
       
  1032 reg expression.
       
  1033 
       
  1034 225
       
  1035 00:11:40,535 --> 00:11:44,285
       
  1036 And this actually helps a
       
  1037 lot for our if condition.
       
  1038 
       
  1039 226
       
  1040 00:11:44,285 --> 00:11:46,925
       
  1041 Look, this was the
       
  1042 original if condition
       
  1043 
       
  1044 227
       
  1045 00:11:46,925 --> 00:11:50,270
       
  1046 and this is direct expression
       
  1047 h. We just simplified.
       
  1048 
       
  1049 228
       
  1050 00:11:50,270 --> 00:11:53,105
       
  1051 If we replace it with this one,
       
  1052 
       
  1053 229
       
  1054 00:11:53,105 --> 00:11:56,090
       
  1055 then we just end up with this.
       
  1056 
       
  1057 230
       
  1058 00:11:56,090 --> 00:11:59,510
       
  1059 And now you will see that
       
  1060 the if condition is actually
       
  1061 
       
  1062 231
       
  1063 00:11:59,510 --> 00:12:02,750
       
  1064 pointless because you
       
  1065 check if it's null above,
       
  1066 
       
  1067 232
       
  1068 00:12:02,750 --> 00:12:05,060
       
  1069 we return this reg
       
  1070 expression or it's
       
  1071 
       
  1072 233
       
  1073 00:12:05,060 --> 00:12:08,210
       
  1074 not nullable and we
       
  1075 return exactly the same.
       
  1076 
       
  1077 234
       
  1078 00:12:08,210 --> 00:12:10,025
       
  1079 That doesn't make any difference.
       
  1080 
       
  1081 235
       
  1082 00:12:10,025 --> 00:12:11,750
       
  1083 So we can just get rid of
       
  1084 
       
  1085 236
       
  1086 00:12:11,750 --> 00:12:14,645
       
  1087 that one and can
       
  1088 replace it by that.
       
  1089 
       
  1090 237
       
  1091 00:12:14,645 --> 00:12:16,865
       
  1092 And you see, this is now
       
  1093 
       
  1094 238
       
  1095 00:12:16,865 --> 00:12:20,720
       
  1096 a much simpler case than
       
  1097 what we had before.
       
  1098 
       
  1099 239
       
  1100 00:12:20,720 --> 00:12:24,170
       
  1101 So let's take stock
       
  1102 what we have so far.
       
  1103 
       
  1104 240
       
  1105 00:12:24,170 --> 00:12:26,915
       
  1106 So we know India CEO case,
       
  1107 
       
  1108 241
       
  1109 00:12:26,915 --> 00:12:30,920
       
  1110 the derivative needs
       
  1111 to be defined as 0.
       
  1112 
       
  1113 242
       
  1114 00:12:30,920 --> 00:12:33,095
       
  1115 So because we define this
       
  1116 
       
  1117 243
       
  1118 00:12:33,095 --> 00:12:36,785
       
  1119 and times as one,
       
  1120 the derivative is 0.
       
  1121 
       
  1122 244
       
  1123 00:12:36,785 --> 00:12:39,889
       
  1124 For chest r, this will
       
  1125 be the derivative.
       
  1126 
       
  1127 245
       
  1128 00:12:39,889 --> 00:12:42,170
       
  1129 And we can't do any
       
  1130 better than that
       
  1131 
       
  1132 246
       
  1133 00:12:42,170 --> 00:12:45,620
       
  1134 for our followed by
       
  1135 RB just found out.
       
  1136 
       
  1137 247
       
  1138 00:12:45,620 --> 00:12:47,270
       
  1139 Actually it is quite simple.
       
  1140 
       
  1141 248
       
  1142 00:12:47,270 --> 00:12:51,410
       
  1143 Reg expression is equivalent
       
  1144 to the derivative.
       
  1145 
       
  1146 249
       
  1147 00:12:51,410 --> 00:12:53,870
       
  1148 Now, we have to continue with
       
  1149 
       
  1150 250
       
  1151 00:12:53,870 --> 00:12:56,090
       
  1152 that case where n is
       
  1153 equal to three and we
       
  1154 
       
  1155 251
       
  1156 00:12:56,090 --> 00:12:58,099
       
  1157 now have three copies
       
  1158 
       
  1159 252
       
  1160 00:12:58,099 --> 00:13:02,390
       
  1161 of this or what should
       
  1162 be the derivative?
       
  1163 
       
  1164 253
       
  1165 00:13:02,390 --> 00:13:05,330
       
  1166 Well, if you look very carefully
       
  1167 
       
  1168 254
       
  1169 00:13:05,330 --> 00:13:08,465
       
  1170 at this emerging pattern,
       
  1171 
       
  1172 255
       
  1173 00:13:08,465 --> 00:13:12,410
       
  1174 I have to say then
       
  1175 what would be nice if,
       
  1176 
       
  1177 256
       
  1178 00:13:12,410 --> 00:13:16,400
       
  1179 if he could show that in
       
  1180 the n equals three case,
       
  1181 
       
  1182 257
       
  1183 00:13:16,400 --> 00:13:18,275
       
  1184 we end up with this.
       
  1185 
       
  1186 258
       
  1187 00:13:18,275 --> 00:13:21,290
       
  1188 Because then what we have is.
       
  1189 
       
  1190 259
       
  1191 00:13:21,290 --> 00:13:25,370
       
  1192 We can define our
       
  1193 nullable for n times
       
  1194 
       
  1195 260
       
  1196 00:13:25,370 --> 00:13:31,025
       
  1197 s. If any cross 0 then
       
  1198 true as nullable.
       
  1199 
       
  1200 261
       
  1201 00:13:31,025 --> 00:13:33,875
       
  1202 And for the derivative,
       
  1203 
       
  1204 262
       
  1205 00:13:33,875 --> 00:13:37,190
       
  1206 we can save if n is equal to 0,
       
  1207 
       
  1208 263
       
  1209 00:13:37,190 --> 00:13:40,235
       
  1210 then we return the
       
  1211 Sierra reg expression.
       
  1212 
       
  1213 264
       
  1214 00:13:40,235 --> 00:13:43,295
       
  1215 Otherwise, as you can see
       
  1216 from this pattern here,
       
  1217 
       
  1218 265
       
  1219 00:13:43,295 --> 00:13:50,735
       
  1220 it would be derivative of
       
  1221 c r four by r n minus one.
       
  1222 
       
  1223 266
       
  1224 00:13:50,735 --> 00:13:54,770
       
  1225 Okay? And the only
       
  1226 
       
  1227 267
       
  1228 00:13:54,770 --> 00:13:56,330
       
  1229 thing we have to make choice that
       
  1230 
       
  1231 268
       
  1232 00:13:56,330 --> 00:13:58,175
       
  1233 this pattern actually holds.
       
  1234 
       
  1235 269
       
  1236 00:13:58,175 --> 00:14:00,470
       
  1237 So that's, I will
       
  1238 show you next in
       
  1239 
       
  1240 270
       
  1241 00:14:00,470 --> 00:14:03,735
       
  1242 the case for n equals three.
       
  1243 
       
  1244 271
       
  1245 00:14:03,735 --> 00:14:07,810
       
  1246 If we have a wreck expression R
       
  1247 
       
  1248 272
       
  1249 00:14:07,810 --> 00:14:09,820
       
  1250 and it's followed
       
  1251 by r and another r,
       
  1252 
       
  1253 273
       
  1254 00:14:09,820 --> 00:14:11,275
       
  1255 three copies of it.
       
  1256 
       
  1257 274
       
  1258 00:14:11,275 --> 00:14:14,245
       
  1259 We can just unfold
       
  1260 again the definition.
       
  1261 
       
  1262 275
       
  1263 00:14:14,245 --> 00:14:16,930
       
  1264 So we would ask if I is nullable,
       
  1265 
       
  1266 276
       
  1267 00:14:16,930 --> 00:14:19,765
       
  1268 then we have this if branch.
       
  1269 
       
  1270 277
       
  1271 00:14:19,765 --> 00:14:23,905
       
  1272 And if i is not nullable
       
  1273 or we have this as branch.
       
  1274 
       
  1275 278
       
  1276 00:14:23,905 --> 00:14:27,010
       
  1277 Okay? What can we do here?
       
  1278 
       
  1279 279
       
  1280 00:14:27,010 --> 00:14:30,310
       
  1281 Well, we notice that
       
  1282 this one is just now
       
  1283 
       
  1284 280
       
  1285 00:14:30,310 --> 00:14:34,510
       
  1286 the derivative of two
       
  1287 Rs, one after another.
       
  1288 
       
  1289 281
       
  1290 00:14:34,510 --> 00:14:37,330
       
  1291 But this we just
       
  1292 calculated a moment ago,
       
  1293 
       
  1294 282
       
  1295 00:14:37,330 --> 00:14:40,120
       
  1296 so we can just
       
  1297 replace it by this.
       
  1298 
       
  1299 283
       
  1300 00:14:40,120 --> 00:14:43,255
       
  1301 Ok. That's what we
       
  1302 calculated earlier.
       
  1303 
       
  1304 284
       
  1305 00:14:43,255 --> 00:14:46,665
       
  1306 But now you will see
       
  1307 again this factor,
       
  1308 
       
  1309 285
       
  1310 00:14:46,665 --> 00:14:48,695
       
  1311 and this factor is the same.
       
  1312 
       
  1313 286
       
  1314 00:14:48,695 --> 00:14:52,700
       
  1315 So I can pull that
       
  1316 out to be like that.
       
  1317 
       
  1318 287
       
  1319 00:14:52,700 --> 00:14:57,380
       
  1320 And I have now followed
       
  1321 by R plus R. Oh,
       
  1322 
       
  1323 288
       
  1324 00:14:57,380 --> 00:14:59,030
       
  1325 hey, man, now you probably
       
  1326 
       
  1327 289
       
  1328 00:14:59,030 --> 00:15:00,680
       
  1329 remember how we did it earlier.
       
  1330 
       
  1331 290
       
  1332 00:15:00,680 --> 00:15:03,080
       
  1333 We can now pull out one copy of
       
  1334 
       
  1335 291
       
  1336 00:15:03,080 --> 00:15:06,020
       
  1337 this are to just get
       
  1338 something like this.
       
  1339 
       
  1340 292
       
  1341 00:15:06,020 --> 00:15:08,765
       
  1342 We have to add one essentially,
       
  1343 
       
  1344 293
       
  1345 00:15:08,765 --> 00:15:11,615
       
  1346 but we now get r plus one.
       
  1347 
       
  1348 294
       
  1349 00:15:11,615 --> 00:15:15,065
       
  1350 And this r here is
       
  1351 now just pulled out.
       
  1352 
       
  1353 295
       
  1354 00:15:15,065 --> 00:15:18,995
       
  1355 Well, here again kicks
       
  1356 in this reasoning.
       
  1357 
       
  1358 296
       
  1359 00:15:18,995 --> 00:15:22,700
       
  1360 We go in this if branch
       
  1361 only if r is nullable,
       
  1362 
       
  1363 297
       
  1364 00:15:22,700 --> 00:15:26,150
       
  1365 so on its own can already
       
  1366 match the empty string.
       
  1367 
       
  1368 298
       
  1369 00:15:26,150 --> 00:15:28,895
       
  1370 So I don't need
       
  1371 really this plus one.
       
  1372 
       
  1373 299
       
  1374 00:15:28,895 --> 00:15:30,695
       
  1375 I can just get rid of it.
       
  1376 
       
  1377 300
       
  1378 00:15:30,695 --> 00:15:33,140
       
  1379 And so I now just have
       
  1380 to rearrange the parent,
       
  1381 
       
  1382 301
       
  1383 00:15:33,140 --> 00:15:35,450
       
  1384 the thesis which we
       
  1385 said we can also do.
       
  1386 
       
  1387 302
       
  1388 00:15:35,450 --> 00:15:37,595
       
  1389 So we have something like that.
       
  1390 
       
  1391 303
       
  1392 00:15:37,595 --> 00:15:39,740
       
  1393 And here again, the
       
  1394 same reasoning,
       
  1395 
       
  1396 304
       
  1397 00:15:39,740 --> 00:15:41,975
       
  1398 we have a if condition
       
  1399 
       
  1400 305
       
  1401 00:15:41,975 --> 00:15:43,310
       
  1402 where it doesn't
       
  1403 really matter what
       
  1404 
       
  1405 306
       
  1406 00:15:43,310 --> 00:15:44,405
       
  1407 we're going to return,
       
  1408 
       
  1409 307
       
  1410 00:15:44,405 --> 00:15:46,205
       
  1411 it's in both branches the same.
       
  1412 
       
  1413 308
       
  1414 00:15:46,205 --> 00:15:48,470
       
  1415 So we can just
       
  1416 replace it by that.
       
  1417 
       
  1418 309
       
  1419 00:15:48,470 --> 00:15:51,920
       
  1420 And yes, now we can be
       
  1421 pretty sure they'll
       
  1422 
       
  1423 310
       
  1424 00:15:51,920 --> 00:15:55,310
       
  1425 work for all the n times
       
  1426 regular expressions.
       
  1427 
       
  1428 311
       
  1429 00:15:55,310 --> 00:15:57,860
       
  1430 And leave that to
       
  1431 the calculation for
       
  1432 
       
  1433 312
       
  1434 00:15:57,860 --> 00:16:02,570
       
  1435 maybe R to the four to you.
       
  1436 
       
  1437 313
       
  1438 00:16:02,570 --> 00:16:04,490
       
  1439 And the reason why I do it
       
  1440 
       
  1441 314
       
  1442 00:16:04,490 --> 00:16:06,605
       
  1443 in such a detail,
       
  1444 this calculation,
       
  1445 
       
  1446 315
       
  1447 00:16:06,605 --> 00:16:08,960
       
  1448 this is really meant
       
  1449 to help you with
       
  1450 
       
  1451 316
       
  1452 00:16:08,960 --> 00:16:13,200
       
  1453 the coursework which is
       
  1454 coming up this week.
       
  1455 
       
  1456 317
       
  1457 00:16:13,210 --> 00:16:16,250
       
  1458 I'm now back in our
       
  1459 implementation.
       
  1460 
       
  1461 318
       
  1462 00:16:16,250 --> 00:16:20,660
       
  1463 In this Reto, said We have
       
  1464 this explicit constructor now
       
  1465 
       
  1466 319
       
  1467 00:16:20,660 --> 00:16:25,535
       
  1468 for n times b can now fill
       
  1469 in the cases for nullable.
       
  1470 
       
  1471 320
       
  1472 00:16:25,535 --> 00:16:27,635
       
  1473 So if we have R in times,
       
  1474 
       
  1475 321
       
  1476 00:16:27,635 --> 00:16:30,995
       
  1477 if this n is equal to
       
  1478 0, we return true.
       
  1479 
       
  1480 322
       
  1481 00:16:30,995 --> 00:16:34,190
       
  1482 Otherwise we have to test
       
  1483 whether eyes nullable.
       
  1484 
       
  1485 323
       
  1486 00:16:34,190 --> 00:16:37,565
       
  1487 And in the derivative case, oi,
       
  1488 
       
  1489 324
       
  1490 00:16:37,565 --> 00:16:40,339
       
  1491 if this n is equal to 0,
       
  1492 
       
  1493 325
       
  1494 00:16:40,339 --> 00:16:43,564
       
  1495 the return this 0
       
  1496 wreck expression.
       
  1497 
       
  1498 326
       
  1499 00:16:43,564 --> 00:16:46,700
       
  1500 Otherwise we return the
       
  1501 sequence of the derivative
       
  1502 
       
  1503 327
       
  1504 00:16:46,700 --> 00:16:50,270
       
  1505 of psi of r four by n times of r,
       
  1506 
       
  1507 328
       
  1508 00:16:50,270 --> 00:16:54,545
       
  1509 but n minus one times and
       
  1510 everything else stays the same.
       
  1511 
       
  1512 329
       
  1513 00:16:54,545 --> 00:16:56,510
       
  1514 Here's the function for strings,
       
  1515 
       
  1516 330
       
  1517 00:16:56,510 --> 00:16:58,430
       
  1518 derivative function for strings.
       
  1519 
       
  1520 331
       
  1521 00:16:58,430 --> 00:17:01,595
       
  1522 In the main mantra
       
  1523 function as all the same.
       
  1524 
       
  1525 332
       
  1526 00:17:01,595 --> 00:17:04,820
       
  1527 We still require this
       
  1528 definition because
       
  1529 
       
  1530 333
       
  1531 00:17:04,820 --> 00:17:06,050
       
  1532 we haven't done anything about
       
  1533 
       
  1534 334
       
  1535 00:17:06,050 --> 00:17:08,090
       
  1536 the optional record
       
  1537 expression yet.
       
  1538 
       
  1539 335
       
  1540 00:17:08,090 --> 00:17:10,670
       
  1541 And we have now at this
       
  1542 
       
  1543 336
       
  1544 00:17:10,670 --> 00:17:13,250
       
  1545 either warm and evil
       
  1546 2-break expression.
       
  1547 
       
  1548 337
       
  1549 00:17:13,250 --> 00:17:15,290
       
  1550 And let's test it. And let's be
       
  1551 
       
  1552 338
       
  1553 00:17:15,290 --> 00:17:17,000
       
  1554 a bit more ambitious.
       
  1555 Be testing it.
       
  1556 
       
  1557 339
       
  1558 00:17:17,000 --> 00:17:20,315
       
  1559 The strings between 01000
       
  1560 
       
  1561 340
       
  1562 00:17:20,315 --> 00:17:22,580
       
  1563 and let's see what the time say.
       
  1564 
       
  1565 341
       
  1566 00:17:22,580 --> 00:17:26,210
       
  1567 I'm testing this again
       
  1568 inside the ammonite rebel.
       
  1569 
       
  1570 342
       
  1571 00:17:26,210 --> 00:17:30,180
       
  1572 And you'll see it should
       
  1573 be now much quicker.
       
  1574 
       
  1575 343
       
  1576 00:17:30,610 --> 00:17:34,640
       
  1577 Okay, it might slow
       
  1578 down also around 600.
       
  1579 
       
  1580 344
       
  1581 00:17:34,640 --> 00:17:40,490
       
  1582 700 needs two seconds,
       
  1583 three seconds, 4800.
       
  1584 
       
  1585 345
       
  1586 00:17:40,490 --> 00:17:43,940
       
  1587 Let's see about it
       
  1588 needs 4 thousand.
       
  1589 
       
  1590 346
       
  1591 00:17:43,940 --> 00:17:47,435
       
  1592 But you have to remember
       
  1593 Ruby and Python.
       
  1594 
       
  1595 347
       
  1596 00:17:47,435 --> 00:17:51,530
       
  1597 They needed half a
       
  1598 minute to just 43 TAs.
       
  1599 
       
  1600 348
       
  1601 00:17:51,530 --> 00:17:54,485
       
  1602 We need a little bit
       
  1603 more than six seconds,
       
  1604 
       
  1605 349
       
  1606 00:17:54,485 --> 00:17:57,110
       
  1607 but we are processing a string of
       
  1608 
       
  1609 350
       
  1610 00:17:57,110 --> 00:18:00,575
       
  1611 1000 days so that success.
       
  1612 
       
  1613 351
       
  1614 00:18:00,575 --> 00:18:04,775
       
  1615 So this speed is also explained
       
  1616 if you look at the sizes.
       
  1617 
       
  1618 352
       
  1619 00:18:04,775 --> 00:18:08,975
       
  1620 Since we now have this explicit
       
  1621 and times constructor,
       
  1622 
       
  1623 353
       
  1624 00:18:08,975 --> 00:18:11,930
       
  1625 it doesn't really matter
       
  1626 how big this n is.
       
  1627 
       
  1628 354
       
  1629 00:18:11,930 --> 00:18:14,540
       
  1630 This evil one reg
       
  1631 expression will always be
       
  1632 
       
  1633 355
       
  1634 00:18:14,540 --> 00:18:17,195
       
  1635 of this size seven,
       
  1636 the beginning.
       
  1637 
       
  1638 356
       
  1639 00:18:17,195 --> 00:18:20,315
       
  1640 And you can also see if you
       
  1641 now build the derivatives,
       
  1642 
       
  1643 357
       
  1644 00:18:20,315 --> 00:18:22,550
       
  1645 they still grow in size,
       
  1646 
       
  1647 358
       
  1648 00:18:22,550 --> 00:18:24,740
       
  1649 but much more moderately.
       
  1650 
       
  1651 359
       
  1652 00:18:24,740 --> 00:18:28,100
       
  1653 And let's try out this
       
  1654 example, this 20 a.
       
  1655 
       
  1656 360
       
  1657 00:18:28,100 --> 00:18:31,685
       
  1658 So we build the derivative
       
  1659 according to 20 character A's.
       
  1660 
       
  1661 361
       
  1662 00:18:31,685 --> 00:18:33,890
       
  1663 In the earlier example,
       
  1664 
       
  1665 362
       
  1666 00:18:33,890 --> 00:18:35,780
       
  1667 we ended up this a
       
  1668 wreck expression which
       
  1669 
       
  1670 363
       
  1671 00:18:35,780 --> 00:18:37,895
       
  1672 had like 8 million plus nodes.
       
  1673 
       
  1674 364
       
  1675 00:18:37,895 --> 00:18:39,830
       
  1676 And if we do this now,
       
  1677 
       
  1678 365
       
  1679 00:18:39,830 --> 00:18:43,205
       
  1680 then we just have a wreck
       
  1681 expression with 211 nodes.
       
  1682 
       
  1683 366
       
  1684 00:18:43,205 --> 00:18:44,750
       
  1685 And that is much smaller and
       
  1686 
       
  1687 367
       
  1688 00:18:44,750 --> 00:18:47,165
       
  1689 all the calculations
       
  1690 would be much quicker.
       
  1691 
       
  1692 368
       
  1693 00:18:47,165 --> 00:18:49,550
       
  1694 So yeah, there's
       
  1695 
       
  1696 369
       
  1697 00:18:49,550 --> 00:18:52,205
       
  1698 this push off CKY algorithm
       
  1699 and this improvement.
       
  1700 
       
  1701 370
       
  1702 00:18:52,205 --> 00:18:54,890
       
  1703 We're now running
       
  1704 circles around Ruby and
       
  1705 
       
  1706 371
       
  1707 00:18:54,890 --> 00:18:58,445
       
  1708 Python because they're just
       
  1709 stuck here at the beginning.
       
  1710 
       
  1711 372
       
  1712 00:18:58,445 --> 00:19:00,230
       
  1713 Because they need already
       
  1714 
       
  1715 373
       
  1716 00:19:00,230 --> 00:19:02,975
       
  1717 like half a minute
       
  1718 for just 30 a's.
       
  1719 
       
  1720 374
       
  1721 00:19:02,975 --> 00:19:05,930
       
  1722 We now can do something
       
  1723 like thousand a's.
       
  1724 
       
  1725 375
       
  1726 00:19:05,930 --> 00:19:07,580
       
  1727 And in reasonable time.
       
  1728 
       
  1729 376
       
  1730 00:19:07,580 --> 00:19:09,740
       
  1731 I think this must be
       
  1732 timing I obtained with
       
  1733 
       
  1734 377
       
  1735 00:19:09,740 --> 00:19:12,635
       
  1736 my older laptop some time ago.
       
  1737 
       
  1738 378
       
  1739 00:19:12,635 --> 00:19:14,210
       
  1740 Because remember we
       
  1741 had something like
       
  1742 
       
  1743 379
       
  1744 00:19:14,210 --> 00:19:16,670
       
  1745 six seconds here, it says 15.
       
  1746 
       
  1747 380
       
  1748 00:19:16,670 --> 00:19:18,080
       
  1749 So you always have to take
       
  1750 
       
  1751 381
       
  1752 00:19:18,080 --> 00:19:20,885
       
  1753 these times with
       
  1754 the pinch of salt.
       
  1755 
       
  1756 382
       
  1757 00:19:20,885 --> 00:19:22,670
       
  1758 It's essentially only the trend,
       
  1759 
       
  1760 383
       
  1761 00:19:22,670 --> 00:19:25,100
       
  1762 but it's clear we are
       
  1763 much, much better.
       
  1764 
       
  1765 384
       
  1766 00:19:25,100 --> 00:19:27,065
       
  1767 So we have worked a lot,
       
  1768 
       
  1769 385
       
  1770 00:19:27,065 --> 00:19:30,720
       
  1771 but we also got something
       
  1772 for it in return.