diff -r 34f77b976b88 -r f9686b22db7e videos/02-algo2.srt --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/videos/02-algo2.srt Sat Oct 03 00:51:47 2020 +0100 @@ -0,0 +1,1772 @@ +1 +00:00:06,020 --> 00:00:09,945 +They come back after +this disappointment + +2 +00:00:09,945 --> 00:00:14,115 +and case of over promising +and under achieving. + +3 +00:00:14,115 --> 00:00:17,295 +Let's have a look whether +we can recover from that. + +4 +00:00:17,295 --> 00:00:20,535 +So here's one problem. + +5 +00:00:20,535 --> 00:00:23,790 +Then we looked at this end +times vk expression and + +6 +00:00:23,790 --> 00:00:27,330 +be able to represent +that directly. + +7 +00:00:27,330 --> 00:00:31,275 +We had two represented as a +sequence record expression. + +8 +00:00:31,275 --> 00:00:32,850 +So we expanded it. + +9 +00:00:32,850 --> 00:00:36,510 +So times one would be just a, + +10 +00:00:36,510 --> 00:00:40,595 +t, times two would +be a, and so on. + +11 +00:00:40,595 --> 00:00:43,535 +And so you can see if +you go already to 13, + +12 +00:00:43,535 --> 00:00:47,960 +N equals 13, the record +expression becomes quite large. + +13 +00:00:47,960 --> 00:00:52,865 +And now the functions +nullable and also derivative. + +14 +00:00:52,865 --> 00:00:56,360 +They need to traverse +this reg expression. + +15 +00:00:56,360 --> 00:00:59,060 +Remember this tree in sometimes + +16 +00:00:59,060 --> 00:01:01,820 +they have to traverse +that even several times. + +17 +00:01:01,820 --> 00:01:04,535 +So that will slow +down the algorithm. + +18 +00:01:04,535 --> 00:01:08,150 +And in particular, because +our first reg expression was + +19 +00:01:08,150 --> 00:01:11,840 +actually not just a bot +a plus one. So b hat. + +20 +00:01:11,840 --> 00:01:14,330 +In the case of 13, +we had a plus one, + +21 +00:01:14,330 --> 00:01:17,330 +a plus one a plus born 13 times. + +22 +00:01:17,330 --> 00:01:20,150 +And this reg +expression just grows, + +23 +00:01:20,150 --> 00:01:25,475 +stay in number n. Just to +show you this also encode, + +24 +00:01:25,475 --> 00:01:28,115 +I'm Stern, This File rewarm + +25 +00:01:28,115 --> 00:01:30,920 +and D and I have a size function. + +26 +00:01:30,920 --> 00:01:33,140 +The size function takes +a regular expression as + +27 +00:01:33,140 --> 00:01:36,215 +argument and produces in teacher. + +28 +00:01:36,215 --> 00:01:39,980 +And essentially it takes +this rec expression scene as + +29 +00:01:39,980 --> 00:01:44,045 +a tree and count how many +nodes are in this tree. + +30 +00:01:44,045 --> 00:01:49,490 +And so if I take this even +one reg expression, this one, + +31 +00:01:49,490 --> 00:01:52,160 +and increase now this n. So you + +32 +00:01:52,160 --> 00:01:54,680 +can already see +for any cross one, + +33 +00:01:54,680 --> 00:01:56,540 +the size of this +record expression is + +34 +00:01:56,540 --> 00:01:59,615 +five and you see it +slowly increases. + +35 +00:01:59,615 --> 00:02:02,150 +And this 20 n equals 20. + +36 +00:02:02,150 --> 00:02:07,100 +The size of this record +expression is SMOW already a 119. + +37 +00:02:07,100 --> 00:02:10,145 +So now the interesting +line as this one. + +38 +00:02:10,145 --> 00:02:13,295 +Remember it when we +built the derivative, + +39 +00:02:13,295 --> 00:02:17,150 +very often parts of a reg +expression gets copied. + +40 +00:02:17,150 --> 00:02:19,280 +So if you have like EVA, + +41 +00:02:19,280 --> 00:02:22,325 +one of 2019 nodes, + +42 +00:02:22,325 --> 00:02:25,475 +now this will be often copied. + +43 +00:02:25,475 --> 00:02:31,475 +And we want to see what's the +resulting tree look like, + +44 +00:02:31,475 --> 00:02:32,780 +how many nodes it has. + +45 +00:02:32,780 --> 00:02:34,985 +So I take here either one of 20 + +46 +00:02:34,985 --> 00:02:38,405 +and the derivative +according to 20 a's. + +47 +00:02:38,405 --> 00:02:41,820 +And just have a look +what the size is. + +48 +00:02:43,210 --> 00:02:45,680 +Okay, that takes away. + +49 +00:02:45,680 --> 00:02:48,410 +You see now this rec expression, + +50 +00:02:48,410 --> 00:02:52,280 +the derivative has already +8 million plus nodes. + +51 +00:02:52,280 --> 00:02:55,400 +And now nullable and +derivative have to + +52 +00:02:55,400 --> 00:02:58,430 +traverse these trees with +a million plus nodes. + +53 +00:02:58,430 --> 00:03:01,400 +So it's no wonder +that this is slow. + +54 +00:03:01,400 --> 00:03:03,860 +The first thing we +can try to mitigate + +55 +00:03:03,860 --> 00:03:06,350 +this explosion problem is to + +56 +00:03:06,350 --> 00:03:09,050 +have an explicit and +times reg expression. + +57 +00:03:09,050 --> 00:03:11,600 +So instead of expanding + +58 +00:03:11,600 --> 00:03:13,415 +it to the sequence +reg expression, + +59 +00:03:13,415 --> 00:03:17,510 +let's just have a single +wreck expression and times, + +60 +00:03:17,510 --> 00:03:20,540 +which takes an expression and + +61 +00:03:20,540 --> 00:03:24,470 +a number and keeps that +regular expression Small. + +62 +00:03:24,470 --> 00:03:27,095 +I'm here in the fire V2, + +63 +00:03:27,095 --> 00:03:29,090 +which is also on Keats. + +64 +00:03:29,090 --> 00:03:32,570 +And the only change I made +is I added explicitly + +65 +00:03:32,570 --> 00:03:33,860 +a wrecker expression for + +66 +00:03:33,860 --> 00:03:36,770 +N times the optional +reg expression. + +67 +00:03:36,770 --> 00:03:39,920 +Very leaf out at the +moment because this a + +68 +00:03:39,920 --> 00:03:41,975 +optional is represented as + +69 +00:03:41,975 --> 00:03:44,645 +a plus one and doesn't +explain too much. + +70 +00:03:44,645 --> 00:03:47,375 +The really the culprit +is this n times. + +71 +00:03:47,375 --> 00:03:51,095 +And instead of expanding +it n times, which has, + +72 +00:03:51,095 --> 00:03:54,275 +take here a wreck expression +and a natural number, + +73 +00:03:54,275 --> 00:03:56,960 +which says how many times +it should be repeated. + +74 +00:03:56,960 --> 00:03:59,165 +And in this week we +can keep it small. + +75 +00:03:59,165 --> 00:04:01,370 +But by adding that +reg expression, + +76 +00:04:01,370 --> 00:04:05,150 +we now have to think about +cases for nullable and + +77 +00:04:05,150 --> 00:04:07,340 +derivative so that we have + +78 +00:04:07,340 --> 00:04:10,205 +to now calculate next +what they look like. + +79 +00:04:10,205 --> 00:04:14,060 +We can actually +calculate what the rule + +80 +00:04:14,060 --> 00:04:17,525 +for nullable should be from +how we defined it earlier. + +81 +00:04:17,525 --> 00:04:19,580 +Remember, if you have +a regular expression, + +82 +00:04:19,580 --> 00:04:21,980 +R and B take in +times of this are, + +83 +00:04:21,980 --> 00:04:25,220 +then indicates if n equals 0, + +84 +00:04:25,220 --> 00:04:28,130 +we find that as the +one reg expression. + +85 +00:04:28,130 --> 00:04:30,380 +If n equals one, + +86 +00:04:30,380 --> 00:04:32,495 +it will be just a +single copy of on. + +87 +00:04:32,495 --> 00:04:33,725 +If n equals two, + +88 +00:04:33,725 --> 00:04:35,270 +it will be two copies. + +89 +00:04:35,270 --> 00:04:39,260 +Sequence if three, we have +three copies and so on. + +90 +00:04:39,260 --> 00:04:41,390 +Now we have to +somehow characterize + +91 +00:04:41,390 --> 00:04:44,285 +when these cases all nullable. + +92 +00:04:44,285 --> 00:04:47,810 +Well, if n equals 0, + +93 +00:04:47,810 --> 00:04:49,850 +then this will be defined as one, + +94 +00:04:49,850 --> 00:04:52,100 +so one can match +the empty string. + +95 +00:04:52,100 --> 00:04:54,785 +So that should be +definitely true. + +96 +00:04:54,785 --> 00:04:57,725 +Okay, if any cross one, + +97 +00:04:57,725 --> 00:05:00,470 +when can this reg expression +match the empty string? + +98 +00:05:00,470 --> 00:05:02,000 +So vent should be nullable. + +99 +00:05:02,000 --> 00:05:07,535 +Well, if nullable of our holes, + +100 +00:05:07,535 --> 00:05:09,860 +okay, so if I can match +the empty string, + +101 +00:05:09,860 --> 00:05:12,110 +then we can match +the empty string. + +102 +00:05:12,110 --> 00:05:14,870 +When can this regular expression +match the empty string? + +103 +00:05:14,870 --> 00:05:16,265 +So these are now two copies of + +104 +00:05:16,265 --> 00:05:20,690 +our well-posed copies need +to match the empty string. + +105 +00:05:20,690 --> 00:05:22,820 +So again, we have to ask + +106 +00:05:22,820 --> 00:05:25,774 +both of them need to be nullable. + +107 +00:05:25,774 --> 00:05:28,520 +Ok? Similarly, in the third case, + +108 +00:05:28,520 --> 00:05:30,710 +all three copies need to be + +109 +00:05:30,710 --> 00:05:33,230 +able to match the empty +string. Men, is that the case? + +110 +00:05:33,230 --> 00:05:38,360 +Well, if nullable of +our holes and so on. + +111 +00:05:38,360 --> 00:05:48,754 +So we can actually define that +if n equals 0, then true. + +112 +00:05:48,754 --> 00:05:56,045 +Else we have to ask with +nullable of our holes. + +113 +00:05:56,045 --> 00:05:58,550 +So that will be the clause we + +114 +00:05:58,550 --> 00:06:01,625 +have to implement for +n times and nullable. + +115 +00:06:01,625 --> 00:06:04,280 +Deriving the definition for + +116 +00:06:04,280 --> 00:06:06,920 +the derivative of the n terms + +117 +00:06:06,920 --> 00:06:10,175 +reg expression. +It's not that easy. + +118 +00:06:10,175 --> 00:06:12,755 +So we have to look again +how it was defined + +119 +00:06:12,755 --> 00:06:16,010 +earlier and we somehow +have to spot a pattern. + +120 +00:06:16,010 --> 00:06:18,380 +The voice, our +algorithm will be again + +121 +00:06:18,380 --> 00:06:20,975 +quite slow if you don't +spot that pattern. + +122 +00:06:20,975 --> 00:06:22,550 +So let's have a look. + +123 +00:06:22,550 --> 00:06:26,240 +So in the case that +n is equal to 0, + +124 +00:06:26,240 --> 00:06:29,885 +then r n times most +defined as just one. + +125 +00:06:29,885 --> 00:06:32,030 +What would be the +derivative of one? + +126 +00:06:32,030 --> 00:06:36,140 +So the derivative of c of one. + +127 +00:06:36,140 --> 00:06:38,990 +Peaches defined as 0. + +128 +00:06:38,990 --> 00:06:41,465 +Okay, fair enough. + +129 +00:06:41,465 --> 00:06:44,960 +If he have any cross one, + +130 +00:06:44,960 --> 00:06:48,125 +then we just have one copy +of this regular expression. + +131 +00:06:48,125 --> 00:06:50,120 +So there's not much as we can do. + +132 +00:06:50,120 --> 00:06:53,735 +We would have to calculate +the derivative of air are. + +133 +00:06:53,735 --> 00:06:57,395 +Now in the n equals two case. + +134 +00:06:57,395 --> 00:07:00,245 +That would mean we +have two copies of + +135 +00:07:00,245 --> 00:07:03,425 +R. And they would be a sequence. + +136 +00:07:03,425 --> 00:07:07,775 +How would be the derivative C of + +137 +00:07:07,775 --> 00:07:10,340 +R four by R be + +138 +00:07:10,340 --> 00:07:13,265 +defined where we would +look up the definition. + +139 +00:07:13,265 --> 00:07:15,470 +And it would say that's +the complicated case + +140 +00:07:15,470 --> 00:07:16,760 +d sequence or + +141 +00:07:16,760 --> 00:07:21,875 +if null a bowl of R, + +142 +00:07:21,875 --> 00:07:25,250 +Then the most complicated case. + +143 +00:07:25,250 --> 00:07:28,225 +Else, That's the easy +case that would be + +144 +00:07:28,225 --> 00:07:32,660 +derivative of c of R four + +145 +00:07:32,660 --> 00:07:35,540 +by R. And then I +just have to copy + +146 +00:07:35,540 --> 00:07:40,775 +that derivative of C +of four by r here. + +147 +00:07:40,775 --> 00:07:43,130 +But in this case I have also + +148 +00:07:43,130 --> 00:07:51,145 +the alternative derivative +of c of r. And from that, + +149 +00:07:51,145 --> 00:07:55,030 +it looks like we can +do much better than + +150 +00:07:55,030 --> 00:07:58,390 +that unless we do + +151 +00:07:58,390 --> 00:08:02,560 +a little bit of magic and the +magic to do with this case. + +152 +00:08:02,560 --> 00:08:07,420 +So let me look at this +case a bit more closely. + +153 +00:08:07,420 --> 00:08:09,819 +Let me go back to slides + +154 +00:08:09,819 --> 00:08:12,940 +because actually the calculation +might be a bit hairy. + +155 +00:08:12,940 --> 00:08:16,945 +So remember we are in this +case where n equals two. + +156 +00:08:16,945 --> 00:08:18,550 +And this was defined as + +157 +00:08:18,550 --> 00:08:20,680 +this sequence work +expression R followed + +158 +00:08:20,680 --> 00:08:23,080 +by r. And the question was, + +159 +00:08:23,080 --> 00:08:26,365 +can we somehow make sense +out of this derivative + +160 +00:08:26,365 --> 00:08:30,370 +where we have to build the +derivative of a sequence. + +161 +00:08:30,370 --> 00:08:33,020 +So features the definition. + +162 +00:08:33,020 --> 00:08:36,050 +We would ask if +the r is nullable, + +163 +00:08:36,050 --> 00:08:39,095 +In this case, we return +this alternative. + +164 +00:08:39,095 --> 00:08:40,640 +And if it's not nullable, + +165 +00:08:40,640 --> 00:08:44,135 +It is just this +record expression. + +166 +00:08:44,135 --> 00:08:49,550 +Now my claim is that +this whole if condition + +167 +00:08:49,550 --> 00:08:55,895 +can be replaced by just this +simple derivative here. + +168 +00:08:55,895 --> 00:08:58,775 +And if that claim is correct, + +169 +00:08:58,775 --> 00:09:01,520 +there would be very nice +because in contrast to + +170 +00:09:01,520 --> 00:09:04,130 +this if condition where + +171 +00:09:04,130 --> 00:09:06,280 +we have to calculate +like nullable, + +172 +00:09:06,280 --> 00:09:08,330 +decide in which branch we are. + +173 +00:09:08,330 --> 00:09:10,580 +We don't have to +calculate your now, + +174 +00:09:10,580 --> 00:09:13,880 +but we can just replace +it by this expression. + +175 +00:09:13,880 --> 00:09:16,670 +So if we can +substantiate that claim, + +176 +00:09:16,670 --> 00:09:19,860 +that will be definitely +good file algorithm. + +177 +00:09:20,140 --> 00:09:22,775 +And to substantiate that, + +178 +00:09:22,775 --> 00:09:26,795 +I will focus on this +record expression here. + +179 +00:09:26,795 --> 00:09:31,100 +And notice that this record +expression the only be + +180 +00:09:31,100 --> 00:09:35,780 +called or only be generated +if r is nullable. + +181 +00:09:35,780 --> 00:09:38,075 +So in any other case, + +182 +00:09:38,075 --> 00:09:40,060 +I will actually not go into it + +183 +00:09:40,060 --> 00:09:43,850 +that if branch and would +be in the other one. + +184 +00:09:43,850 --> 00:09:45,260 +So if we are in this if branch, + +185 +00:09:45,260 --> 00:09:47,705 +we definitely know +that R is nullable. + +186 +00:09:47,705 --> 00:09:52,955 +Okay? Okay, so here's +our regular expression. + +187 +00:09:52,955 --> 00:09:55,940 +And we know it's nullable. + +188 +00:09:55,940 --> 00:09:57,920 +So we have to somehow find + +189 +00:09:57,920 --> 00:10:00,380 +an equivalent expression that + +190 +00:10:00,380 --> 00:10:04,100 +is smaller or simpler +than that one. + +191 +00:10:04,100 --> 00:10:05,945 +Let's see what we can do. + +192 +00:10:05,945 --> 00:10:10,160 +So the first thing +actually is we multiplying + +193 +00:10:10,160 --> 00:10:16,595 +this right hand side of the +alternative is times one. + +194 +00:10:16,595 --> 00:10:19,700 +We can do that +because this does not + +195 +00:10:19,700 --> 00:10:23,090 +change which strings this +work expression can match. + +196 +00:10:23,090 --> 00:10:25,685 +Remember we even had it +as a simplification row, + +197 +00:10:25,685 --> 00:10:27,425 +just in this case B, + +198 +00:10:27,425 --> 00:10:29,705 +don't apply it as a +simplification will + +199 +00:10:29,705 --> 00:10:31,310 +actually make this +work expression + +200 +00:10:31,310 --> 00:10:32,720 +a bit more complicated. + +201 +00:10:32,720 --> 00:10:34,910 +But times one doesn't make + +202 +00:10:34,910 --> 00:10:37,820 +a difference because it +means the end of the string, + +203 +00:10:37,820 --> 00:10:40,070 +we still want to match +the empty string. + +204 +00:10:40,070 --> 00:10:42,155 +Okay, so that is possible. + +205 +00:10:42,155 --> 00:10:45,740 +I can do that. Once +we have done that, + +206 +00:10:45,740 --> 00:10:48,410 +you will notice that this +factor derivative of + +207 +00:10:48,410 --> 00:10:51,860 +stuff are exactly the +same as that one. + +208 +00:10:51,860 --> 00:10:54,650 +And in between is a plus. + +209 +00:10:54,650 --> 00:10:57,440 +So you probably remember the law + +210 +00:10:57,440 --> 00:11:00,170 +from school math +that I can pull out + +211 +00:11:00,170 --> 00:11:02,735 +this factor derivative of c of r. + +212 +00:11:02,735 --> 00:11:06,320 +And I'm inside the parentheses +is left with that. + +213 +00:11:06,320 --> 00:11:09,245 +So now I have r plus one. + +214 +00:11:09,245 --> 00:11:13,055 +Usually we cannot +simplify r plus one. + +215 +00:11:13,055 --> 00:11:15,530 +If it had been R +plus 0, then yes, + +216 +00:11:15,530 --> 00:11:18,665 +we could have got rid of the CRO. + +217 +00:11:18,665 --> 00:11:21,590 +Plus one often +makes a difference, + +218 +00:11:21,590 --> 00:11:22,970 +but not in our case. + +219 +00:11:22,970 --> 00:11:25,940 +Remember, we know that +this R is nullable, + +220 +00:11:25,940 --> 00:11:29,840 +so on its own can already +match the empty string. + +221 +00:11:29,840 --> 00:11:33,305 +So we don't really need this +alternative plus one there, + +222 +00:11:33,305 --> 00:11:35,300 +so we can just get rid of that. + +223 +00:11:35,300 --> 00:11:37,070 +Okay, and so now we have + +224 +00:11:37,070 --> 00:11:40,535 +a much simpler wound +reg expression. + +225 +00:11:40,535 --> 00:11:44,285 +And this actually helps a +lot for our if condition. + +226 +00:11:44,285 --> 00:11:46,925 +Look, this was the +original if condition + +227 +00:11:46,925 --> 00:11:50,270 +and this is direct expression +h. We just simplified. + +228 +00:11:50,270 --> 00:11:53,105 +If we replace it with this one, + +229 +00:11:53,105 --> 00:11:56,090 +then we just end up with this. + +230 +00:11:56,090 --> 00:11:59,510 +And now you will see that +the if condition is actually + +231 +00:11:59,510 --> 00:12:02,750 +pointless because you +check if it's null above, + +232 +00:12:02,750 --> 00:12:05,060 +we return this reg +expression or it's + +233 +00:12:05,060 --> 00:12:08,210 +not nullable and we +return exactly the same. + +234 +00:12:08,210 --> 00:12:10,025 +That doesn't make any difference. + +235 +00:12:10,025 --> 00:12:11,750 +So we can just get rid of + +236 +00:12:11,750 --> 00:12:14,645 +that one and can +replace it by that. + +237 +00:12:14,645 --> 00:12:16,865 +And you see, this is now + +238 +00:12:16,865 --> 00:12:20,720 +a much simpler case than +what we had before. + +239 +00:12:20,720 --> 00:12:24,170 +So let's take stock +what we have so far. + +240 +00:12:24,170 --> 00:12:26,915 +So we know India CEO case, + +241 +00:12:26,915 --> 00:12:30,920 +the derivative needs +to be defined as 0. + +242 +00:12:30,920 --> 00:12:33,095 +So because we define this + +243 +00:12:33,095 --> 00:12:36,785 +and times as one, +the derivative is 0. + +244 +00:12:36,785 --> 00:12:39,889 +For chest r, this will +be the derivative. + +245 +00:12:39,889 --> 00:12:42,170 +And we can't do any +better than that + +246 +00:12:42,170 --> 00:12:45,620 +for our followed by +RB just found out. + +247 +00:12:45,620 --> 00:12:47,270 +Actually it is quite simple. + +248 +00:12:47,270 --> 00:12:51,410 +Reg expression is equivalent +to the derivative. + +249 +00:12:51,410 --> 00:12:53,870 +Now, we have to continue with + +250 +00:12:53,870 --> 00:12:56,090 +that case where n is +equal to three and we + +251 +00:12:56,090 --> 00:12:58,099 +now have three copies + +252 +00:12:58,099 --> 00:13:02,390 +of this or what should +be the derivative? + +253 +00:13:02,390 --> 00:13:05,330 +Well, if you look very carefully + +254 +00:13:05,330 --> 00:13:08,465 +at this emerging pattern, + +255 +00:13:08,465 --> 00:13:12,410 +I have to say then +what would be nice if, + +256 +00:13:12,410 --> 00:13:16,400 +if he could show that in +the n equals three case, + +257 +00:13:16,400 --> 00:13:18,275 +we end up with this. + +258 +00:13:18,275 --> 00:13:21,290 +Because then what we have is. + +259 +00:13:21,290 --> 00:13:25,370 +We can define our +nullable for n times + +260 +00:13:25,370 --> 00:13:31,025 +s. If any cross 0 then +true as nullable. + +261 +00:13:31,025 --> 00:13:33,875 +And for the derivative, + +262 +00:13:33,875 --> 00:13:37,190 +we can save if n is equal to 0, + +263 +00:13:37,190 --> 00:13:40,235 +then we return the +Sierra reg expression. + +264 +00:13:40,235 --> 00:13:43,295 +Otherwise, as you can see +from this pattern here, + +265 +00:13:43,295 --> 00:13:50,735 +it would be derivative of +c r four by r n minus one. + +266 +00:13:50,735 --> 00:13:54,770 +Okay? And the only + +267 +00:13:54,770 --> 00:13:56,330 +thing we have to make choice that + +268 +00:13:56,330 --> 00:13:58,175 +this pattern actually holds. + +269 +00:13:58,175 --> 00:14:00,470 +So that's, I will +show you next in + +270 +00:14:00,470 --> 00:14:03,735 +the case for n equals three. + +271 +00:14:03,735 --> 00:14:07,810 +If we have a wreck expression R + +272 +00:14:07,810 --> 00:14:09,820 +and it's followed +by r and another r, + +273 +00:14:09,820 --> 00:14:11,275 +three copies of it. + +274 +00:14:11,275 --> 00:14:14,245 +We can just unfold +again the definition. + +275 +00:14:14,245 --> 00:14:16,930 +So we would ask if I is nullable, + +276 +00:14:16,930 --> 00:14:19,765 +then we have this if branch. + +277 +00:14:19,765 --> 00:14:23,905 +And if i is not nullable +or we have this as branch. + +278 +00:14:23,905 --> 00:14:27,010 +Okay? What can we do here? + +279 +00:14:27,010 --> 00:14:30,310 +Well, we notice that +this one is just now + +280 +00:14:30,310 --> 00:14:34,510 +the derivative of two +Rs, one after another. + +281 +00:14:34,510 --> 00:14:37,330 +But this we just +calculated a moment ago, + +282 +00:14:37,330 --> 00:14:40,120 +so we can just +replace it by this. + +283 +00:14:40,120 --> 00:14:43,255 +Ok. That's what we +calculated earlier. + +284 +00:14:43,255 --> 00:14:46,665 +But now you will see +again this factor, + +285 +00:14:46,665 --> 00:14:48,695 +and this factor is the same. + +286 +00:14:48,695 --> 00:14:52,700 +So I can pull that +out to be like that. + +287 +00:14:52,700 --> 00:14:57,380 +And I have now followed +by R plus R. Oh, + +288 +00:14:57,380 --> 00:14:59,030 +hey, man, now you probably + +289 +00:14:59,030 --> 00:15:00,680 +remember how we did it earlier. + +290 +00:15:00,680 --> 00:15:03,080 +We can now pull out one copy of + +291 +00:15:03,080 --> 00:15:06,020 +this are to just get +something like this. + +292 +00:15:06,020 --> 00:15:08,765 +We have to add one essentially, + +293 +00:15:08,765 --> 00:15:11,615 +but we now get r plus one. + +294 +00:15:11,615 --> 00:15:15,065 +And this r here is +now just pulled out. + +295 +00:15:15,065 --> 00:15:18,995 +Well, here again kicks +in this reasoning. + +296 +00:15:18,995 --> 00:15:22,700 +We go in this if branch +only if r is nullable, + +297 +00:15:22,700 --> 00:15:26,150 +so on its own can already +match the empty string. + +298 +00:15:26,150 --> 00:15:28,895 +So I don't need +really this plus one. + +299 +00:15:28,895 --> 00:15:30,695 +I can just get rid of it. + +300 +00:15:30,695 --> 00:15:33,140 +And so I now just have +to rearrange the parent, + +301 +00:15:33,140 --> 00:15:35,450 +the thesis which we +said we can also do. + +302 +00:15:35,450 --> 00:15:37,595 +So we have something like that. + +303 +00:15:37,595 --> 00:15:39,740 +And here again, the +same reasoning, + +304 +00:15:39,740 --> 00:15:41,975 +we have a if condition + +305 +00:15:41,975 --> 00:15:43,310 +where it doesn't +really matter what + +306 +00:15:43,310 --> 00:15:44,405 +we're going to return, + +307 +00:15:44,405 --> 00:15:46,205 +it's in both branches the same. + +308 +00:15:46,205 --> 00:15:48,470 +So we can just +replace it by that. + +309 +00:15:48,470 --> 00:15:51,920 +And yes, now we can be +pretty sure they'll + +310 +00:15:51,920 --> 00:15:55,310 +work for all the n times +regular expressions. + +311 +00:15:55,310 --> 00:15:57,860 +And leave that to +the calculation for + +312 +00:15:57,860 --> 00:16:02,570 +maybe R to the four to you. + +313 +00:16:02,570 --> 00:16:04,490 +And the reason why I do it + +314 +00:16:04,490 --> 00:16:06,605 +in such a detail, +this calculation, + +315 +00:16:06,605 --> 00:16:08,960 +this is really meant +to help you with + +316 +00:16:08,960 --> 00:16:13,200 +the coursework which is +coming up this week. + +317 +00:16:13,210 --> 00:16:16,250 +I'm now back in our +implementation. + +318 +00:16:16,250 --> 00:16:20,660 +In this Reto, said We have +this explicit constructor now + +319 +00:16:20,660 --> 00:16:25,535 +for n times b can now fill +in the cases for nullable. + +320 +00:16:25,535 --> 00:16:27,635 +So if we have R in times, + +321 +00:16:27,635 --> 00:16:30,995 +if this n is equal to +0, we return true. + +322 +00:16:30,995 --> 00:16:34,190 +Otherwise we have to test +whether eyes nullable. + +323 +00:16:34,190 --> 00:16:37,565 +And in the derivative case, oi, + +324 +00:16:37,565 --> 00:16:40,339 +if this n is equal to 0, + +325 +00:16:40,339 --> 00:16:43,564 +the return this 0 +wreck expression. + +326 +00:16:43,564 --> 00:16:46,700 +Otherwise we return the +sequence of the derivative + +327 +00:16:46,700 --> 00:16:50,270 +of psi of r four by n times of r, + +328 +00:16:50,270 --> 00:16:54,545 +but n minus one times and +everything else stays the same. + +329 +00:16:54,545 --> 00:16:56,510 +Here's the function for strings, + +330 +00:16:56,510 --> 00:16:58,430 +derivative function for strings. + +331 +00:16:58,430 --> 00:17:01,595 +In the main mantra +function as all the same. + +332 +00:17:01,595 --> 00:17:04,820 +We still require this +definition because + +333 +00:17:04,820 --> 00:17:06,050 +we haven't done anything about + +334 +00:17:06,050 --> 00:17:08,090 +the optional record +expression yet. + +335 +00:17:08,090 --> 00:17:10,670 +And we have now at this + +336 +00:17:10,670 --> 00:17:13,250 +either warm and evil +2-break expression. + +337 +00:17:13,250 --> 00:17:15,290 +And let's test it. And let's be + +338 +00:17:15,290 --> 00:17:17,000 +a bit more ambitious. +Be testing it. + +339 +00:17:17,000 --> 00:17:20,315 +The strings between 01000 + +340 +00:17:20,315 --> 00:17:22,580 +and let's see what the time say. + +341 +00:17:22,580 --> 00:17:26,210 +I'm testing this again +inside the ammonite rebel. + +342 +00:17:26,210 --> 00:17:30,180 +And you'll see it should +be now much quicker. + +343 +00:17:30,610 --> 00:17:34,640 +Okay, it might slow +down also around 600. + +344 +00:17:34,640 --> 00:17:40,490 +700 needs two seconds, +three seconds, 4800. + +345 +00:17:40,490 --> 00:17:43,940 +Let's see about it +needs 4 thousand. + +346 +00:17:43,940 --> 00:17:47,435 +But you have to remember +Ruby and Python. + +347 +00:17:47,435 --> 00:17:51,530 +They needed half a +minute to just 43 TAs. + +348 +00:17:51,530 --> 00:17:54,485 +We need a little bit +more than six seconds, + +349 +00:17:54,485 --> 00:17:57,110 +but we are processing a string of + +350 +00:17:57,110 --> 00:18:00,575 +1000 days so that success. + +351 +00:18:00,575 --> 00:18:04,775 +So this speed is also explained +if you look at the sizes. + +352 +00:18:04,775 --> 00:18:08,975 +Since we now have this explicit +and times constructor, + +353 +00:18:08,975 --> 00:18:11,930 +it doesn't really matter +how big this n is. + +354 +00:18:11,930 --> 00:18:14,540 +This evil one reg +expression will always be + +355 +00:18:14,540 --> 00:18:17,195 +of this size seven, +the beginning. + +356 +00:18:17,195 --> 00:18:20,315 +And you can also see if you +now build the derivatives, + +357 +00:18:20,315 --> 00:18:22,550 +they still grow in size, + +358 +00:18:22,550 --> 00:18:24,740 +but much more moderately. + +359 +00:18:24,740 --> 00:18:28,100 +And let's try out this +example, this 20 a. + +360 +00:18:28,100 --> 00:18:31,685 +So we build the derivative +according to 20 character A's. + +361 +00:18:31,685 --> 00:18:33,890 +In the earlier example, + +362 +00:18:33,890 --> 00:18:35,780 +we ended up this a +wreck expression which + +363 +00:18:35,780 --> 00:18:37,895 +had like 8 million plus nodes. + +364 +00:18:37,895 --> 00:18:39,830 +And if we do this now, + +365 +00:18:39,830 --> 00:18:43,205 +then we just have a wreck +expression with 211 nodes. + +366 +00:18:43,205 --> 00:18:44,750 +And that is much smaller and + +367 +00:18:44,750 --> 00:18:47,165 +all the calculations +would be much quicker. + +368 +00:18:47,165 --> 00:18:49,550 +So yeah, there's + +369 +00:18:49,550 --> 00:18:52,205 +this push off CKY algorithm +and this improvement. + +370 +00:18:52,205 --> 00:18:54,890 +We're now running +circles around Ruby and + +371 +00:18:54,890 --> 00:18:58,445 +Python because they're just +stuck here at the beginning. + +372 +00:18:58,445 --> 00:19:00,230 +Because they need already + +373 +00:19:00,230 --> 00:19:02,975 +like half a minute +for just 30 a's. + +374 +00:19:02,975 --> 00:19:05,930 +We now can do something +like thousand a's. + +375 +00:19:05,930 --> 00:19:07,580 +And in reasonable time. + +376 +00:19:07,580 --> 00:19:09,740 +I think this must be +timing I obtained with + +377 +00:19:09,740 --> 00:19:12,635 +my older laptop some time ago. + +378 +00:19:12,635 --> 00:19:14,210 +Because remember we +had something like + +379 +00:19:14,210 --> 00:19:16,670 +six seconds here, it says 15. + +380 +00:19:16,670 --> 00:19:18,080 +So you always have to take + +381 +00:19:18,080 --> 00:19:20,885 +these times with +the pinch of salt. + +382 +00:19:20,885 --> 00:19:22,670 +It's essentially only the trend, + +383 +00:19:22,670 --> 00:19:25,100 +but it's clear we are +much, much better. + +384 +00:19:25,100 --> 00:19:27,065 +So we have worked a lot, + +385 +00:19:27,065 --> 00:19:30,720 +but we also got something +for it in return.