videos/02-algo2.srt
changeset 769 f9686b22db7e
child 772 3bf3f5bb067e
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/videos/02-algo2.srt	Sat Oct 03 00:51:47 2020 +0100
@@ -0,0 +1,1772 @@
+1
+00:00:06,020 --> 00:00:09,945
+They come back after
+this disappointment
+
+2
+00:00:09,945 --> 00:00:14,115
+and case of over promising
+and under achieving.
+
+3
+00:00:14,115 --> 00:00:17,295
+Let's have a look whether
+we can recover from that.
+
+4
+00:00:17,295 --> 00:00:20,535
+So here's one problem.
+
+5
+00:00:20,535 --> 00:00:23,790
+Then we looked at this end
+times vk expression and
+
+6
+00:00:23,790 --> 00:00:27,330
+be able to represent
+that directly.
+
+7
+00:00:27,330 --> 00:00:31,275
+We had two represented as a
+sequence record expression.
+
+8
+00:00:31,275 --> 00:00:32,850
+So we expanded it.
+
+9
+00:00:32,850 --> 00:00:36,510
+So times one would be just a,
+
+10
+00:00:36,510 --> 00:00:40,595
+t, times two would
+be a, and so on.
+
+11
+00:00:40,595 --> 00:00:43,535
+And so you can see if
+you go already to 13,
+
+12
+00:00:43,535 --> 00:00:47,960
+N equals 13, the record
+expression becomes quite large.
+
+13
+00:00:47,960 --> 00:00:52,865
+And now the functions
+nullable and also derivative.
+
+14
+00:00:52,865 --> 00:00:56,360
+They need to traverse
+this reg expression.
+
+15
+00:00:56,360 --> 00:00:59,060
+Remember this tree in sometimes
+
+16
+00:00:59,060 --> 00:01:01,820
+they have to traverse
+that even several times.
+
+17
+00:01:01,820 --> 00:01:04,535
+So that will slow
+down the algorithm.
+
+18
+00:01:04,535 --> 00:01:08,150
+And in particular, because
+our first reg expression was
+
+19
+00:01:08,150 --> 00:01:11,840
+actually not just a bot
+a plus one. So b hat.
+
+20
+00:01:11,840 --> 00:01:14,330
+In the case of 13,
+we had a plus one,
+
+21
+00:01:14,330 --> 00:01:17,330
+a plus one a plus born 13 times.
+
+22
+00:01:17,330 --> 00:01:20,150
+And this reg
+expression just grows,
+
+23
+00:01:20,150 --> 00:01:25,475
+stay in number n. Just to
+show you this also encode,
+
+24
+00:01:25,475 --> 00:01:28,115
+I'm Stern, This File rewarm
+
+25
+00:01:28,115 --> 00:01:30,920
+and D and I have a size function.
+
+26
+00:01:30,920 --> 00:01:33,140
+The size function takes
+a regular expression as
+
+27
+00:01:33,140 --> 00:01:36,215
+argument and produces in teacher.
+
+28
+00:01:36,215 --> 00:01:39,980
+And essentially it takes
+this rec expression scene as
+
+29
+00:01:39,980 --> 00:01:44,045
+a tree and count how many
+nodes are in this tree.
+
+30
+00:01:44,045 --> 00:01:49,490
+And so if I take this even
+one reg expression, this one,
+
+31
+00:01:49,490 --> 00:01:52,160
+and increase now this n. So you
+
+32
+00:01:52,160 --> 00:01:54,680
+can already see
+for any cross one,
+
+33
+00:01:54,680 --> 00:01:56,540
+the size of this
+record expression is
+
+34
+00:01:56,540 --> 00:01:59,615
+five and you see it
+slowly increases.
+
+35
+00:01:59,615 --> 00:02:02,150
+And this 20 n equals 20.
+
+36
+00:02:02,150 --> 00:02:07,100
+The size of this record
+expression is SMOW already a 119.
+
+37
+00:02:07,100 --> 00:02:10,145
+So now the interesting
+line as this one.
+
+38
+00:02:10,145 --> 00:02:13,295
+Remember it when we
+built the derivative,
+
+39
+00:02:13,295 --> 00:02:17,150
+very often parts of a reg
+expression gets copied.
+
+40
+00:02:17,150 --> 00:02:19,280
+So if you have like EVA,
+
+41
+00:02:19,280 --> 00:02:22,325
+one of 2019 nodes,
+
+42
+00:02:22,325 --> 00:02:25,475
+now this will be often copied.
+
+43
+00:02:25,475 --> 00:02:31,475
+And we want to see what's the
+resulting tree look like,
+
+44
+00:02:31,475 --> 00:02:32,780
+how many nodes it has.
+
+45
+00:02:32,780 --> 00:02:34,985
+So I take here either one of 20
+
+46
+00:02:34,985 --> 00:02:38,405
+and the derivative
+according to 20 a's.
+
+47
+00:02:38,405 --> 00:02:41,820
+And just have a look
+what the size is.
+
+48
+00:02:43,210 --> 00:02:45,680
+Okay, that takes away.
+
+49
+00:02:45,680 --> 00:02:48,410
+You see now this rec expression,
+
+50
+00:02:48,410 --> 00:02:52,280
+the derivative has already
+8 million plus nodes.
+
+51
+00:02:52,280 --> 00:02:55,400
+And now nullable and
+derivative have to
+
+52
+00:02:55,400 --> 00:02:58,430
+traverse these trees with
+a million plus nodes.
+
+53
+00:02:58,430 --> 00:03:01,400
+So it's no wonder
+that this is slow.
+
+54
+00:03:01,400 --> 00:03:03,860
+The first thing we
+can try to mitigate
+
+55
+00:03:03,860 --> 00:03:06,350
+this explosion problem is to
+
+56
+00:03:06,350 --> 00:03:09,050
+have an explicit and
+times reg expression.
+
+57
+00:03:09,050 --> 00:03:11,600
+So instead of expanding
+
+58
+00:03:11,600 --> 00:03:13,415
+it to the sequence
+reg expression,
+
+59
+00:03:13,415 --> 00:03:17,510
+let's just have a single
+wreck expression and times,
+
+60
+00:03:17,510 --> 00:03:20,540
+which takes an expression and
+
+61
+00:03:20,540 --> 00:03:24,470
+a number and keeps that
+regular expression Small.
+
+62
+00:03:24,470 --> 00:03:27,095
+I'm here in the fire V2,
+
+63
+00:03:27,095 --> 00:03:29,090
+which is also on Keats.
+
+64
+00:03:29,090 --> 00:03:32,570
+And the only change I made
+is I added explicitly
+
+65
+00:03:32,570 --> 00:03:33,860
+a wrecker expression for
+
+66
+00:03:33,860 --> 00:03:36,770
+N times the optional
+reg expression.
+
+67
+00:03:36,770 --> 00:03:39,920
+Very leaf out at the
+moment because this a
+
+68
+00:03:39,920 --> 00:03:41,975
+optional is represented as
+
+69
+00:03:41,975 --> 00:03:44,645
+a plus one and doesn't
+explain too much.
+
+70
+00:03:44,645 --> 00:03:47,375
+The really the culprit
+is this n times.
+
+71
+00:03:47,375 --> 00:03:51,095
+And instead of expanding
+it n times, which has,
+
+72
+00:03:51,095 --> 00:03:54,275
+take here a wreck expression
+and a natural number,
+
+73
+00:03:54,275 --> 00:03:56,960
+which says how many times
+it should be repeated.
+
+74
+00:03:56,960 --> 00:03:59,165
+And in this week we
+can keep it small.
+
+75
+00:03:59,165 --> 00:04:01,370
+But by adding that
+reg expression,
+
+76
+00:04:01,370 --> 00:04:05,150
+we now have to think about
+cases for nullable and
+
+77
+00:04:05,150 --> 00:04:07,340
+derivative so that we have
+
+78
+00:04:07,340 --> 00:04:10,205
+to now calculate next
+what they look like.
+
+79
+00:04:10,205 --> 00:04:14,060
+We can actually
+calculate what the rule
+
+80
+00:04:14,060 --> 00:04:17,525
+for nullable should be from
+how we defined it earlier.
+
+81
+00:04:17,525 --> 00:04:19,580
+Remember, if you have
+a regular expression,
+
+82
+00:04:19,580 --> 00:04:21,980
+R and B take in
+times of this are,
+
+83
+00:04:21,980 --> 00:04:25,220
+then indicates if n equals 0,
+
+84
+00:04:25,220 --> 00:04:28,130
+we find that as the
+one reg expression.
+
+85
+00:04:28,130 --> 00:04:30,380
+If n equals one,
+
+86
+00:04:30,380 --> 00:04:32,495
+it will be just a
+single copy of on.
+
+87
+00:04:32,495 --> 00:04:33,725
+If n equals two,
+
+88
+00:04:33,725 --> 00:04:35,270
+it will be two copies.
+
+89
+00:04:35,270 --> 00:04:39,260
+Sequence if three, we have
+three copies and so on.
+
+90
+00:04:39,260 --> 00:04:41,390
+Now we have to
+somehow characterize
+
+91
+00:04:41,390 --> 00:04:44,285
+when these cases all nullable.
+
+92
+00:04:44,285 --> 00:04:47,810
+Well, if n equals 0,
+
+93
+00:04:47,810 --> 00:04:49,850
+then this will be defined as one,
+
+94
+00:04:49,850 --> 00:04:52,100
+so one can match
+the empty string.
+
+95
+00:04:52,100 --> 00:04:54,785
+So that should be
+definitely true.
+
+96
+00:04:54,785 --> 00:04:57,725
+Okay, if any cross one,
+
+97
+00:04:57,725 --> 00:05:00,470
+when can this reg expression
+match the empty string?
+
+98
+00:05:00,470 --> 00:05:02,000
+So vent should be nullable.
+
+99
+00:05:02,000 --> 00:05:07,535
+Well, if nullable of our holes,
+
+100
+00:05:07,535 --> 00:05:09,860
+okay, so if I can match
+the empty string,
+
+101
+00:05:09,860 --> 00:05:12,110
+then we can match
+the empty string.
+
+102
+00:05:12,110 --> 00:05:14,870
+When can this regular expression
+match the empty string?
+
+103
+00:05:14,870 --> 00:05:16,265
+So these are now two copies of
+
+104
+00:05:16,265 --> 00:05:20,690
+our well-posed copies need
+to match the empty string.
+
+105
+00:05:20,690 --> 00:05:22,820
+So again, we have to ask
+
+106
+00:05:22,820 --> 00:05:25,774
+both of them need to be nullable.
+
+107
+00:05:25,774 --> 00:05:28,520
+Ok? Similarly, in the third case,
+
+108
+00:05:28,520 --> 00:05:30,710
+all three copies need to be
+
+109
+00:05:30,710 --> 00:05:33,230
+able to match the empty
+string. Men, is that the case?
+
+110
+00:05:33,230 --> 00:05:38,360
+Well, if nullable of
+our holes and so on.
+
+111
+00:05:38,360 --> 00:05:48,754
+So we can actually define that
+if n equals 0, then true.
+
+112
+00:05:48,754 --> 00:05:56,045
+Else we have to ask with
+nullable of our holes.
+
+113
+00:05:56,045 --> 00:05:58,550
+So that will be the clause we
+
+114
+00:05:58,550 --> 00:06:01,625
+have to implement for
+n times and nullable.
+
+115
+00:06:01,625 --> 00:06:04,280
+Deriving the definition for
+
+116
+00:06:04,280 --> 00:06:06,920
+the derivative of the n terms
+
+117
+00:06:06,920 --> 00:06:10,175
+reg expression.
+It's not that easy.
+
+118
+00:06:10,175 --> 00:06:12,755
+So we have to look again
+how it was defined
+
+119
+00:06:12,755 --> 00:06:16,010
+earlier and we somehow
+have to spot a pattern.
+
+120
+00:06:16,010 --> 00:06:18,380
+The voice, our
+algorithm will be again
+
+121
+00:06:18,380 --> 00:06:20,975
+quite slow if you don't
+spot that pattern.
+
+122
+00:06:20,975 --> 00:06:22,550
+So let's have a look.
+
+123
+00:06:22,550 --> 00:06:26,240
+So in the case that
+n is equal to 0,
+
+124
+00:06:26,240 --> 00:06:29,885
+then r n times most
+defined as just one.
+
+125
+00:06:29,885 --> 00:06:32,030
+What would be the
+derivative of one?
+
+126
+00:06:32,030 --> 00:06:36,140
+So the derivative of c of one.
+
+127
+00:06:36,140 --> 00:06:38,990
+Peaches defined as 0.
+
+128
+00:06:38,990 --> 00:06:41,465
+Okay, fair enough.
+
+129
+00:06:41,465 --> 00:06:44,960
+If he have any cross one,
+
+130
+00:06:44,960 --> 00:06:48,125
+then we just have one copy
+of this regular expression.
+
+131
+00:06:48,125 --> 00:06:50,120
+So there's not much as we can do.
+
+132
+00:06:50,120 --> 00:06:53,735
+We would have to calculate
+the derivative of air are.
+
+133
+00:06:53,735 --> 00:06:57,395
+Now in the n equals two case.
+
+134
+00:06:57,395 --> 00:07:00,245
+That would mean we
+have two copies of
+
+135
+00:07:00,245 --> 00:07:03,425
+R. And they would be a sequence.
+
+136
+00:07:03,425 --> 00:07:07,775
+How would be the derivative C of
+
+137
+00:07:07,775 --> 00:07:10,340
+R four by R be
+
+138
+00:07:10,340 --> 00:07:13,265
+defined where we would
+look up the definition.
+
+139
+00:07:13,265 --> 00:07:15,470
+And it would say that's
+the complicated case
+
+140
+00:07:15,470 --> 00:07:16,760
+d sequence or
+
+141
+00:07:16,760 --> 00:07:21,875
+if null a bowl of R,
+
+142
+00:07:21,875 --> 00:07:25,250
+Then the most complicated case.
+
+143
+00:07:25,250 --> 00:07:28,225
+Else, That's the easy
+case that would be
+
+144
+00:07:28,225 --> 00:07:32,660
+derivative of c of R four
+
+145
+00:07:32,660 --> 00:07:35,540
+by R. And then I
+just have to copy
+
+146
+00:07:35,540 --> 00:07:40,775
+that derivative of C
+of four by r here.
+
+147
+00:07:40,775 --> 00:07:43,130
+But in this case I have also
+
+148
+00:07:43,130 --> 00:07:51,145
+the alternative derivative
+of c of r. And from that,
+
+149
+00:07:51,145 --> 00:07:55,030
+it looks like we can
+do much better than
+
+150
+00:07:55,030 --> 00:07:58,390
+that unless we do
+
+151
+00:07:58,390 --> 00:08:02,560
+a little bit of magic and the
+magic to do with this case.
+
+152
+00:08:02,560 --> 00:08:07,420
+So let me look at this
+case a bit more closely.
+
+153
+00:08:07,420 --> 00:08:09,819
+Let me go back to slides
+
+154
+00:08:09,819 --> 00:08:12,940
+because actually the calculation
+might be a bit hairy.
+
+155
+00:08:12,940 --> 00:08:16,945
+So remember we are in this
+case where n equals two.
+
+156
+00:08:16,945 --> 00:08:18,550
+And this was defined as
+
+157
+00:08:18,550 --> 00:08:20,680
+this sequence work
+expression R followed
+
+158
+00:08:20,680 --> 00:08:23,080
+by r. And the question was,
+
+159
+00:08:23,080 --> 00:08:26,365
+can we somehow make sense
+out of this derivative
+
+160
+00:08:26,365 --> 00:08:30,370
+where we have to build the
+derivative of a sequence.
+
+161
+00:08:30,370 --> 00:08:33,020
+So features the definition.
+
+162
+00:08:33,020 --> 00:08:36,050
+We would ask if
+the r is nullable,
+
+163
+00:08:36,050 --> 00:08:39,095
+In this case, we return
+this alternative.
+
+164
+00:08:39,095 --> 00:08:40,640
+And if it's not nullable,
+
+165
+00:08:40,640 --> 00:08:44,135
+It is just this
+record expression.
+
+166
+00:08:44,135 --> 00:08:49,550
+Now my claim is that
+this whole if condition
+
+167
+00:08:49,550 --> 00:08:55,895
+can be replaced by just this
+simple derivative here.
+
+168
+00:08:55,895 --> 00:08:58,775
+And if that claim is correct,
+
+169
+00:08:58,775 --> 00:09:01,520
+there would be very nice
+because in contrast to
+
+170
+00:09:01,520 --> 00:09:04,130
+this if condition where
+
+171
+00:09:04,130 --> 00:09:06,280
+we have to calculate
+like nullable,
+
+172
+00:09:06,280 --> 00:09:08,330
+decide in which branch we are.
+
+173
+00:09:08,330 --> 00:09:10,580
+We don't have to
+calculate your now,
+
+174
+00:09:10,580 --> 00:09:13,880
+but we can just replace
+it by this expression.
+
+175
+00:09:13,880 --> 00:09:16,670
+So if we can
+substantiate that claim,
+
+176
+00:09:16,670 --> 00:09:19,860
+that will be definitely
+good file algorithm.
+
+177
+00:09:20,140 --> 00:09:22,775
+And to substantiate that,
+
+178
+00:09:22,775 --> 00:09:26,795
+I will focus on this
+record expression here.
+
+179
+00:09:26,795 --> 00:09:31,100
+And notice that this record
+expression the only be
+
+180
+00:09:31,100 --> 00:09:35,780
+called or only be generated
+if r is nullable.
+
+181
+00:09:35,780 --> 00:09:38,075
+So in any other case,
+
+182
+00:09:38,075 --> 00:09:40,060
+I will actually not go into it
+
+183
+00:09:40,060 --> 00:09:43,850
+that if branch and would
+be in the other one.
+
+184
+00:09:43,850 --> 00:09:45,260
+So if we are in this if branch,
+
+185
+00:09:45,260 --> 00:09:47,705
+we definitely know
+that R is nullable.
+
+186
+00:09:47,705 --> 00:09:52,955
+Okay? Okay, so here's
+our regular expression.
+
+187
+00:09:52,955 --> 00:09:55,940
+And we know it's nullable.
+
+188
+00:09:55,940 --> 00:09:57,920
+So we have to somehow find
+
+189
+00:09:57,920 --> 00:10:00,380
+an equivalent expression that
+
+190
+00:10:00,380 --> 00:10:04,100
+is smaller or simpler
+than that one.
+
+191
+00:10:04,100 --> 00:10:05,945
+Let's see what we can do.
+
+192
+00:10:05,945 --> 00:10:10,160
+So the first thing
+actually is we multiplying
+
+193
+00:10:10,160 --> 00:10:16,595
+this right hand side of the
+alternative is times one.
+
+194
+00:10:16,595 --> 00:10:19,700
+We can do that
+because this does not
+
+195
+00:10:19,700 --> 00:10:23,090
+change which strings this
+work expression can match.
+
+196
+00:10:23,090 --> 00:10:25,685
+Remember we even had it
+as a simplification row,
+
+197
+00:10:25,685 --> 00:10:27,425
+just in this case B,
+
+198
+00:10:27,425 --> 00:10:29,705
+don't apply it as a
+simplification will
+
+199
+00:10:29,705 --> 00:10:31,310
+actually make this
+work expression
+
+200
+00:10:31,310 --> 00:10:32,720
+a bit more complicated.
+
+201
+00:10:32,720 --> 00:10:34,910
+But times one doesn't make
+
+202
+00:10:34,910 --> 00:10:37,820
+a difference because it
+means the end of the string,
+
+203
+00:10:37,820 --> 00:10:40,070
+we still want to match
+the empty string.
+
+204
+00:10:40,070 --> 00:10:42,155
+Okay, so that is possible.
+
+205
+00:10:42,155 --> 00:10:45,740
+I can do that. Once
+we have done that,
+
+206
+00:10:45,740 --> 00:10:48,410
+you will notice that this
+factor derivative of
+
+207
+00:10:48,410 --> 00:10:51,860
+stuff are exactly the
+same as that one.
+
+208
+00:10:51,860 --> 00:10:54,650
+And in between is a plus.
+
+209
+00:10:54,650 --> 00:10:57,440
+So you probably remember the law
+
+210
+00:10:57,440 --> 00:11:00,170
+from school math
+that I can pull out
+
+211
+00:11:00,170 --> 00:11:02,735
+this factor derivative of c of r.
+
+212
+00:11:02,735 --> 00:11:06,320
+And I'm inside the parentheses
+is left with that.
+
+213
+00:11:06,320 --> 00:11:09,245
+So now I have r plus one.
+
+214
+00:11:09,245 --> 00:11:13,055
+Usually we cannot
+simplify r plus one.
+
+215
+00:11:13,055 --> 00:11:15,530
+If it had been R
+plus 0, then yes,
+
+216
+00:11:15,530 --> 00:11:18,665
+we could have got rid of the CRO.
+
+217
+00:11:18,665 --> 00:11:21,590
+Plus one often
+makes a difference,
+
+218
+00:11:21,590 --> 00:11:22,970
+but not in our case.
+
+219
+00:11:22,970 --> 00:11:25,940
+Remember, we know that
+this R is nullable,
+
+220
+00:11:25,940 --> 00:11:29,840
+so on its own can already
+match the empty string.
+
+221
+00:11:29,840 --> 00:11:33,305
+So we don't really need this
+alternative plus one there,
+
+222
+00:11:33,305 --> 00:11:35,300
+so we can just get rid of that.
+
+223
+00:11:35,300 --> 00:11:37,070
+Okay, and so now we have
+
+224
+00:11:37,070 --> 00:11:40,535
+a much simpler wound
+reg expression.
+
+225
+00:11:40,535 --> 00:11:44,285
+And this actually helps a
+lot for our if condition.
+
+226
+00:11:44,285 --> 00:11:46,925
+Look, this was the
+original if condition
+
+227
+00:11:46,925 --> 00:11:50,270
+and this is direct expression
+h. We just simplified.
+
+228
+00:11:50,270 --> 00:11:53,105
+If we replace it with this one,
+
+229
+00:11:53,105 --> 00:11:56,090
+then we just end up with this.
+
+230
+00:11:56,090 --> 00:11:59,510
+And now you will see that
+the if condition is actually
+
+231
+00:11:59,510 --> 00:12:02,750
+pointless because you
+check if it's null above,
+
+232
+00:12:02,750 --> 00:12:05,060
+we return this reg
+expression or it's
+
+233
+00:12:05,060 --> 00:12:08,210
+not nullable and we
+return exactly the same.
+
+234
+00:12:08,210 --> 00:12:10,025
+That doesn't make any difference.
+
+235
+00:12:10,025 --> 00:12:11,750
+So we can just get rid of
+
+236
+00:12:11,750 --> 00:12:14,645
+that one and can
+replace it by that.
+
+237
+00:12:14,645 --> 00:12:16,865
+And you see, this is now
+
+238
+00:12:16,865 --> 00:12:20,720
+a much simpler case than
+what we had before.
+
+239
+00:12:20,720 --> 00:12:24,170
+So let's take stock
+what we have so far.
+
+240
+00:12:24,170 --> 00:12:26,915
+So we know India CEO case,
+
+241
+00:12:26,915 --> 00:12:30,920
+the derivative needs
+to be defined as 0.
+
+242
+00:12:30,920 --> 00:12:33,095
+So because we define this
+
+243
+00:12:33,095 --> 00:12:36,785
+and times as one,
+the derivative is 0.
+
+244
+00:12:36,785 --> 00:12:39,889
+For chest r, this will
+be the derivative.
+
+245
+00:12:39,889 --> 00:12:42,170
+And we can't do any
+better than that
+
+246
+00:12:42,170 --> 00:12:45,620
+for our followed by
+RB just found out.
+
+247
+00:12:45,620 --> 00:12:47,270
+Actually it is quite simple.
+
+248
+00:12:47,270 --> 00:12:51,410
+Reg expression is equivalent
+to the derivative.
+
+249
+00:12:51,410 --> 00:12:53,870
+Now, we have to continue with
+
+250
+00:12:53,870 --> 00:12:56,090
+that case where n is
+equal to three and we
+
+251
+00:12:56,090 --> 00:12:58,099
+now have three copies
+
+252
+00:12:58,099 --> 00:13:02,390
+of this or what should
+be the derivative?
+
+253
+00:13:02,390 --> 00:13:05,330
+Well, if you look very carefully
+
+254
+00:13:05,330 --> 00:13:08,465
+at this emerging pattern,
+
+255
+00:13:08,465 --> 00:13:12,410
+I have to say then
+what would be nice if,
+
+256
+00:13:12,410 --> 00:13:16,400
+if he could show that in
+the n equals three case,
+
+257
+00:13:16,400 --> 00:13:18,275
+we end up with this.
+
+258
+00:13:18,275 --> 00:13:21,290
+Because then what we have is.
+
+259
+00:13:21,290 --> 00:13:25,370
+We can define our
+nullable for n times
+
+260
+00:13:25,370 --> 00:13:31,025
+s. If any cross 0 then
+true as nullable.
+
+261
+00:13:31,025 --> 00:13:33,875
+And for the derivative,
+
+262
+00:13:33,875 --> 00:13:37,190
+we can save if n is equal to 0,
+
+263
+00:13:37,190 --> 00:13:40,235
+then we return the
+Sierra reg expression.
+
+264
+00:13:40,235 --> 00:13:43,295
+Otherwise, as you can see
+from this pattern here,
+
+265
+00:13:43,295 --> 00:13:50,735
+it would be derivative of
+c r four by r n minus one.
+
+266
+00:13:50,735 --> 00:13:54,770
+Okay? And the only
+
+267
+00:13:54,770 --> 00:13:56,330
+thing we have to make choice that
+
+268
+00:13:56,330 --> 00:13:58,175
+this pattern actually holds.
+
+269
+00:13:58,175 --> 00:14:00,470
+So that's, I will
+show you next in
+
+270
+00:14:00,470 --> 00:14:03,735
+the case for n equals three.
+
+271
+00:14:03,735 --> 00:14:07,810
+If we have a wreck expression R
+
+272
+00:14:07,810 --> 00:14:09,820
+and it's followed
+by r and another r,
+
+273
+00:14:09,820 --> 00:14:11,275
+three copies of it.
+
+274
+00:14:11,275 --> 00:14:14,245
+We can just unfold
+again the definition.
+
+275
+00:14:14,245 --> 00:14:16,930
+So we would ask if I is nullable,
+
+276
+00:14:16,930 --> 00:14:19,765
+then we have this if branch.
+
+277
+00:14:19,765 --> 00:14:23,905
+And if i is not nullable
+or we have this as branch.
+
+278
+00:14:23,905 --> 00:14:27,010
+Okay? What can we do here?
+
+279
+00:14:27,010 --> 00:14:30,310
+Well, we notice that
+this one is just now
+
+280
+00:14:30,310 --> 00:14:34,510
+the derivative of two
+Rs, one after another.
+
+281
+00:14:34,510 --> 00:14:37,330
+But this we just
+calculated a moment ago,
+
+282
+00:14:37,330 --> 00:14:40,120
+so we can just
+replace it by this.
+
+283
+00:14:40,120 --> 00:14:43,255
+Ok. That's what we
+calculated earlier.
+
+284
+00:14:43,255 --> 00:14:46,665
+But now you will see
+again this factor,
+
+285
+00:14:46,665 --> 00:14:48,695
+and this factor is the same.
+
+286
+00:14:48,695 --> 00:14:52,700
+So I can pull that
+out to be like that.
+
+287
+00:14:52,700 --> 00:14:57,380
+And I have now followed
+by R plus R. Oh,
+
+288
+00:14:57,380 --> 00:14:59,030
+hey, man, now you probably
+
+289
+00:14:59,030 --> 00:15:00,680
+remember how we did it earlier.
+
+290
+00:15:00,680 --> 00:15:03,080
+We can now pull out one copy of
+
+291
+00:15:03,080 --> 00:15:06,020
+this are to just get
+something like this.
+
+292
+00:15:06,020 --> 00:15:08,765
+We have to add one essentially,
+
+293
+00:15:08,765 --> 00:15:11,615
+but we now get r plus one.
+
+294
+00:15:11,615 --> 00:15:15,065
+And this r here is
+now just pulled out.
+
+295
+00:15:15,065 --> 00:15:18,995
+Well, here again kicks
+in this reasoning.
+
+296
+00:15:18,995 --> 00:15:22,700
+We go in this if branch
+only if r is nullable,
+
+297
+00:15:22,700 --> 00:15:26,150
+so on its own can already
+match the empty string.
+
+298
+00:15:26,150 --> 00:15:28,895
+So I don't need
+really this plus one.
+
+299
+00:15:28,895 --> 00:15:30,695
+I can just get rid of it.
+
+300
+00:15:30,695 --> 00:15:33,140
+And so I now just have
+to rearrange the parent,
+
+301
+00:15:33,140 --> 00:15:35,450
+the thesis which we
+said we can also do.
+
+302
+00:15:35,450 --> 00:15:37,595
+So we have something like that.
+
+303
+00:15:37,595 --> 00:15:39,740
+And here again, the
+same reasoning,
+
+304
+00:15:39,740 --> 00:15:41,975
+we have a if condition
+
+305
+00:15:41,975 --> 00:15:43,310
+where it doesn't
+really matter what
+
+306
+00:15:43,310 --> 00:15:44,405
+we're going to return,
+
+307
+00:15:44,405 --> 00:15:46,205
+it's in both branches the same.
+
+308
+00:15:46,205 --> 00:15:48,470
+So we can just
+replace it by that.
+
+309
+00:15:48,470 --> 00:15:51,920
+And yes, now we can be
+pretty sure they'll
+
+310
+00:15:51,920 --> 00:15:55,310
+work for all the n times
+regular expressions.
+
+311
+00:15:55,310 --> 00:15:57,860
+And leave that to
+the calculation for
+
+312
+00:15:57,860 --> 00:16:02,570
+maybe R to the four to you.
+
+313
+00:16:02,570 --> 00:16:04,490
+And the reason why I do it
+
+314
+00:16:04,490 --> 00:16:06,605
+in such a detail,
+this calculation,
+
+315
+00:16:06,605 --> 00:16:08,960
+this is really meant
+to help you with
+
+316
+00:16:08,960 --> 00:16:13,200
+the coursework which is
+coming up this week.
+
+317
+00:16:13,210 --> 00:16:16,250
+I'm now back in our
+implementation.
+
+318
+00:16:16,250 --> 00:16:20,660
+In this Reto, said We have
+this explicit constructor now
+
+319
+00:16:20,660 --> 00:16:25,535
+for n times b can now fill
+in the cases for nullable.
+
+320
+00:16:25,535 --> 00:16:27,635
+So if we have R in times,
+
+321
+00:16:27,635 --> 00:16:30,995
+if this n is equal to
+0, we return true.
+
+322
+00:16:30,995 --> 00:16:34,190
+Otherwise we have to test
+whether eyes nullable.
+
+323
+00:16:34,190 --> 00:16:37,565
+And in the derivative case, oi,
+
+324
+00:16:37,565 --> 00:16:40,339
+if this n is equal to 0,
+
+325
+00:16:40,339 --> 00:16:43,564
+the return this 0
+wreck expression.
+
+326
+00:16:43,564 --> 00:16:46,700
+Otherwise we return the
+sequence of the derivative
+
+327
+00:16:46,700 --> 00:16:50,270
+of psi of r four by n times of r,
+
+328
+00:16:50,270 --> 00:16:54,545
+but n minus one times and
+everything else stays the same.
+
+329
+00:16:54,545 --> 00:16:56,510
+Here's the function for strings,
+
+330
+00:16:56,510 --> 00:16:58,430
+derivative function for strings.
+
+331
+00:16:58,430 --> 00:17:01,595
+In the main mantra
+function as all the same.
+
+332
+00:17:01,595 --> 00:17:04,820
+We still require this
+definition because
+
+333
+00:17:04,820 --> 00:17:06,050
+we haven't done anything about
+
+334
+00:17:06,050 --> 00:17:08,090
+the optional record
+expression yet.
+
+335
+00:17:08,090 --> 00:17:10,670
+And we have now at this
+
+336
+00:17:10,670 --> 00:17:13,250
+either warm and evil
+2-break expression.
+
+337
+00:17:13,250 --> 00:17:15,290
+And let's test it. And let's be
+
+338
+00:17:15,290 --> 00:17:17,000
+a bit more ambitious.
+Be testing it.
+
+339
+00:17:17,000 --> 00:17:20,315
+The strings between 01000
+
+340
+00:17:20,315 --> 00:17:22,580
+and let's see what the time say.
+
+341
+00:17:22,580 --> 00:17:26,210
+I'm testing this again
+inside the ammonite rebel.
+
+342
+00:17:26,210 --> 00:17:30,180
+And you'll see it should
+be now much quicker.
+
+343
+00:17:30,610 --> 00:17:34,640
+Okay, it might slow
+down also around 600.
+
+344
+00:17:34,640 --> 00:17:40,490
+700 needs two seconds,
+three seconds, 4800.
+
+345
+00:17:40,490 --> 00:17:43,940
+Let's see about it
+needs 4 thousand.
+
+346
+00:17:43,940 --> 00:17:47,435
+But you have to remember
+Ruby and Python.
+
+347
+00:17:47,435 --> 00:17:51,530
+They needed half a
+minute to just 43 TAs.
+
+348
+00:17:51,530 --> 00:17:54,485
+We need a little bit
+more than six seconds,
+
+349
+00:17:54,485 --> 00:17:57,110
+but we are processing a string of
+
+350
+00:17:57,110 --> 00:18:00,575
+1000 days so that success.
+
+351
+00:18:00,575 --> 00:18:04,775
+So this speed is also explained
+if you look at the sizes.
+
+352
+00:18:04,775 --> 00:18:08,975
+Since we now have this explicit
+and times constructor,
+
+353
+00:18:08,975 --> 00:18:11,930
+it doesn't really matter
+how big this n is.
+
+354
+00:18:11,930 --> 00:18:14,540
+This evil one reg
+expression will always be
+
+355
+00:18:14,540 --> 00:18:17,195
+of this size seven,
+the beginning.
+
+356
+00:18:17,195 --> 00:18:20,315
+And you can also see if you
+now build the derivatives,
+
+357
+00:18:20,315 --> 00:18:22,550
+they still grow in size,
+
+358
+00:18:22,550 --> 00:18:24,740
+but much more moderately.
+
+359
+00:18:24,740 --> 00:18:28,100
+And let's try out this
+example, this 20 a.
+
+360
+00:18:28,100 --> 00:18:31,685
+So we build the derivative
+according to 20 character A's.
+
+361
+00:18:31,685 --> 00:18:33,890
+In the earlier example,
+
+362
+00:18:33,890 --> 00:18:35,780
+we ended up this a
+wreck expression which
+
+363
+00:18:35,780 --> 00:18:37,895
+had like 8 million plus nodes.
+
+364
+00:18:37,895 --> 00:18:39,830
+And if we do this now,
+
+365
+00:18:39,830 --> 00:18:43,205
+then we just have a wreck
+expression with 211 nodes.
+
+366
+00:18:43,205 --> 00:18:44,750
+And that is much smaller and
+
+367
+00:18:44,750 --> 00:18:47,165
+all the calculations
+would be much quicker.
+
+368
+00:18:47,165 --> 00:18:49,550
+So yeah, there's
+
+369
+00:18:49,550 --> 00:18:52,205
+this push off CKY algorithm
+and this improvement.
+
+370
+00:18:52,205 --> 00:18:54,890
+We're now running
+circles around Ruby and
+
+371
+00:18:54,890 --> 00:18:58,445
+Python because they're just
+stuck here at the beginning.
+
+372
+00:18:58,445 --> 00:19:00,230
+Because they need already
+
+373
+00:19:00,230 --> 00:19:02,975
+like half a minute
+for just 30 a's.
+
+374
+00:19:02,975 --> 00:19:05,930
+We now can do something
+like thousand a's.
+
+375
+00:19:05,930 --> 00:19:07,580
+And in reasonable time.
+
+376
+00:19:07,580 --> 00:19:09,740
+I think this must be
+timing I obtained with
+
+377
+00:19:09,740 --> 00:19:12,635
+my older laptop some time ago.
+
+378
+00:19:12,635 --> 00:19:14,210
+Because remember we
+had something like
+
+379
+00:19:14,210 --> 00:19:16,670
+six seconds here, it says 15.
+
+380
+00:19:16,670 --> 00:19:18,080
+So you always have to take
+
+381
+00:19:18,080 --> 00:19:20,885
+these times with
+the pinch of salt.
+
+382
+00:19:20,885 --> 00:19:22,670
+It's essentially only the trend,
+
+383
+00:19:22,670 --> 00:19:25,100
+but it's clear we are
+much, much better.
+
+384
+00:19:25,100 --> 00:19:27,065
+So we have worked a lot,
+
+385
+00:19:27,065 --> 00:19:30,720
+but we also got something
+for it in return.