diff -r 260e330f7466 -r 42733adf2a48 videos/02-algo3.srt --- a/videos/02-algo3.srt Mon Oct 05 01:42:47 2020 +0100 +++ b/videos/02-algo3.srt Mon Oct 05 17:46:12 2020 +0100 @@ -1,6 +1,6 @@ 1 00:00:06,350 --> 00:00:10,305 -They come back. I +Welcome come back. I can hear you saying, 2 @@ -19,7 +19,7 @@ 5 00:00:17,415 --> 00:00:21,265 Specifically, we had two -regular expressions. +evil regular expressions. 6 00:00:21,265 --> 00:00:23,480 @@ -27,13 +27,13 @@ 7 00:00:23,480 --> 00:00:27,480 -Where let's have a look at +Well, let's have a look at that other case in this video. 8 00:00:29,140 --> 00:00:32,705 So I'm back here -in this Reto file. +in this re2.sc file. 9 00:00:32,705 --> 00:00:35,180 @@ -42,7 +42,7 @@ 10 00:00:35,180 --> 00:00:39,665 -nicely for strings between 01000. +nicely for strings between 0 and 1000. 11 00:00:39,665 --> 00:00:42,725 @@ -55,11 +55,11 @@ 13 00:00:44,090 --> 00:00:48,470 for relatively small -strings between 020. +strings between 0 and 20. 14 00:00:48,470 --> 00:00:50,240 -And let's see what it say. +And let's see what it says. 15 00:00:50,240 --> 00:00:51,800 @@ -67,13 +67,13 @@ 16 00:00:51,800 --> 00:00:57,320 -the ammonoids repo and +the Amoonite REPL and it doesn't look too bad. 17 00:00:57,320 --> 00:01:01,160 But this might be because -20 is not general enough. +20 is not generous enough. 18 00:01:01,160 --> 00:01:03,620 @@ -120,12 +120,12 @@ 28 00:01:33,905 --> 00:01:37,640 -which is Daphne bad and we +which is definitely bad and we have to have a look at that. 29 00:01:37,640 --> 00:01:40,640 -Okay? It's always +It's always instructive with 30 @@ -135,13 +135,13 @@ 31 00:01:43,460 --> 00:01:45,695 -the record expressions -we calculate +the regular expressions +we calculate. 32 00:01:45,695 --> 00:01:49,625 -the evil to Is this -a star, star B. +The EVIL2 is this +a star, star b. 33 00:01:49,625 --> 00:01:51,800 @@ -174,15 +174,15 @@ 39 00:02:05,600 --> 00:02:08,420 -If I take this evil to wreck +If I take this EVIL2 regular 40 00:02:08,420 --> 00:02:09,935 -expression and they built +expression and then build 41 00:02:09,935 --> 00:02:12,470 -now longer and +longer and longer derivatives, 42 @@ -209,7 +209,7 @@ 47 00:02:21,680 --> 00:02:26,870 But the size of the -reg expression, +regular expression, 48 00:02:26,870 --> 00:02:28,310 @@ -235,7 +235,7 @@ 53 00:02:39,560 --> 00:02:42,785 -This reg expression +This regular expression just grew too much. 54 @@ -245,13 +245,13 @@ 55 00:02:46,520 --> 00:02:49,655 -this record expression +this regular expression and make it not 56 00:02:49,655 --> 00:02:52,700 grow so much when we -build derivative. +build the derivative. 57 00:02:52,700 --> 00:02:54,020 @@ -279,7 +279,7 @@ 62 00:03:03,740 --> 00:03:06,560 this example of a -wreck expression R, +regulat expression r, 63 00:03:06,560 --> 00:03:11,465 @@ -312,15 +312,15 @@ 69 00:03:26,570 --> 00:03:29,015 which is this whole -wreck expression again. +regular expression again. 70 00:03:29,015 --> 00:03:30,935 -So you can also see. +So you can also see 71 00:03:30,935 --> 00:03:34,745 -Derivative operation +the derivative operation introduces a lot of junk. 72 @@ -330,7 +330,7 @@ 73 00:03:38,165 --> 00:03:42,455 -And this times one could +And this times 1 could be also thrown away. 74 @@ -344,16 +344,16 @@ 76 00:03:48,110 --> 00:03:49,580 -it still be a, +it still b, 77 00:03:49,580 --> 00:03:54,905 -so it would be B followed by -R. And in this second case, +so it would be b followed by +r. And in this second case, 78 00:03:54,905 --> 00:03:57,515 -C0 times b, that will be just 0. +0 times b, that will be just 0. 79 00:03:57,515 --> 00:03:59,270 @@ -361,13 +361,13 @@ 80 00:03:59,270 --> 00:04:05,330 -11 times r would be just +1 ... times r would be just r. And in the last case, 81 00:04:05,330 --> 00:04:12,155 -C0 times B would be 00 plus -0 is 00 times r would be 0. +0 times b would be 0. 0 plus +0 is 0. 0 times r would be 0. 82 00:04:12,155 --> 00:04:17,540 @@ -391,8 +391,8 @@ 86 00:04:27,035 --> 00:04:31,130 -0 wouldn't go away by -building annuity where tests. +And the 0s wouldn't go away by +building a new derivative. 87 00:04:31,130 --> 00:04:34,310 @@ -435,7 +435,7 @@ 95 00:04:58,715 --> 00:05:01,415 this might have -introduced some chunk. +introduced some junk. 96 00:05:01,415 --> 00:05:02,960 @@ -443,7 +443,7 @@ 97 00:05:02,960 --> 00:05:06,590 -essentially a additional function +essentially an additional function 98 00:05:06,590 --> 00:05:08,945 @@ -455,15 +455,15 @@ 100 00:05:10,490 --> 00:05:13,340 -keep steer, Rekha expression, +keeps the regular expression, 101 00:05:13,340 --> 00:05:16,730 -relatively small, pickers debts, +relatively small, because that 102 00:05:16,730 --> 00:05:19,535 -the Achilles heel +is the Achilles' heel of this algorithm. 103 @@ -488,7 +488,7 @@ 107 00:05:30,455 --> 00:05:33,080 compress again this -rec expression +regular expression 108 00:05:33,080 --> 00:05:35,570 @@ -506,7 +506,7 @@ 111 00:05:42,440 --> 00:05:46,040 -this reg expression and +this regular expression and deletes all this junk, 112 @@ -539,7 +539,7 @@ 118 00:06:02,720 --> 00:06:05,060 -The only simplify when we have +We only simplify when we have 119 00:06:05,060 --> 00:06:08,255 @@ -585,16 +585,16 @@ 128 00:06:30,530 --> 00:06:34,880 Now, however, there -is one small point. +is one small point 129 00:06:34,880 --> 00:06:39,785 -You have to be aware of -these simplification rules. +you have to be aware of. +These simplification rules 130 00:06:39,785 --> 00:06:42,740 -They need to be essentially +they need to be essentially applied backwards. 131 @@ -604,12 +604,12 @@ 132 00:06:45,650 --> 00:06:49,085 -this record expression and +this regular expression and then have to find out, 133 00:06:49,085 --> 00:06:51,170 -can I apply simplification? +can I apply the simplification? 134 00:06:51,170 --> 00:06:52,670 @@ -655,26 +655,26 @@ 143 00:07:16,850 --> 00:07:20,885 -what that means is the -matching this reg expression. +what that means we're +matching this regular expression. 144 00:07:20,885 --> 00:07:23,120 -Be test whether it's +We test whether it's an alternative or 145 00:07:23,120 --> 00:07:26,345 -a sequence only then we +a sequence. Only then we actually go into action. 146 00:07:26,345 --> 00:07:28,385 -Once we know. +Once we know 147 00:07:28,385 --> 00:07:30,530 -In case of an alternative, +in case of an alternative, 148 00:07:30,530 --> 00:07:32,120 @@ -682,7 +682,7 @@ 149 00:07:32,120 --> 00:07:35,765 -P first, simplify each component, +We first, simplify each component, 150 00:07:35,765 --> 00:07:39,095 @@ -696,7 +696,7 @@ 152 00:07:41,180 --> 00:07:45,679 this simplification of -R1 resulted into a 0. +r1 resulted into a 0. 153 00:07:45,679 --> 00:07:47,884 @@ -704,17 +704,17 @@ 154 00:07:47,884 --> 00:07:49,190 -than I can just replace it +then I can just replace it 155 00:07:49,190 --> 00:07:53,375 -by r is two a simplified -version of R2. +by r2s, which a simplified +version of r2. 156 00:07:53,375 --> 00:07:58,820 -If it came back r as -two is actually 0, +If it came back r2s +is actually 0, 157 00:07:58,820 --> 00:08:00,410 @@ -722,7 +722,7 @@ 158 00:08:00,410 --> 00:08:03,785 -the simplified version of a warm. +the simplified version of a r1. 159 00:08:03,785 --> 00:08:07,460 @@ -736,7 +736,7 @@ 161 00:08:11,180 --> 00:08:15,170 -next century I can throw away +then I can throw away one of the alternatives. 162 @@ -746,11 +746,11 @@ 163 00:08:18,080 --> 00:08:21,155 -but the simplified components +but the simplified components. 164 00:08:21,155 --> 00:08:23,915 -in the sequence is +In the sequence case it is pretty much the same. 165 @@ -761,16 +761,16 @@ 166 00:08:27,950 --> 00:08:31,385 So I call first simplify -and then have a look. +and then have a look... 167 00:08:31,385 --> 00:08:33,020 -Does it matches C or one of +Does it matches 0 one of 168 00:08:33,020 --> 00:08:36,035 -the simplification -damage, just return 0. +the simplifications, +then I just return 0. 169 00:08:36,035 --> 00:08:38,330 @@ -787,15 +787,15 @@ 172 00:08:43,310 --> 00:08:45,920 -And if this iss two is one, +And if the rs2 is one, 173 00:08:45,920 --> 00:08:47,615 -and I keep the first component, +and I keep the first component. 174 00:08:47,615 --> 00:08:50,359 -and otherwise it's +And otherwise it's still the sequence. 175 @@ -805,13 +805,13 @@ 176 00:08:53,540 --> 00:08:55,700 -It's just a plain +If it's just a plain 0. I leave it in. 177 00:08:55,700 --> 00:08:57,860 If it's a plain -warm, I leave it in. +1, I leave it in. 178 00:08:57,860 --> 00:09:00,170 @@ -820,7 +820,7 @@ 179 00:09:00,170 --> 00:09:02,840 I don't open up this -door and simplify it. +star and simplify it. 180 00:09:02,840 --> 00:09:06,680 @@ -835,20 +835,20 @@ 182 00:09:10,280 --> 00:09:12,980 So I'm now in the -file read three, +file re3.sc, 183 00:09:12,980 --> 00:09:17,405 -and it's the same as Reto. +and it's the same as re2.sc. 184 00:09:17,405 --> 00:09:20,885 It still has this -explicit and Times case, +explicit and n-times case, 185 00:09:20,885 --> 00:09:24,665 -nullable and derivative +nullable and derivative are still the same. 186 @@ -862,12 +862,12 @@ 188 00:09:29,960 --> 00:09:33,725 -the derivative after -the apply deteriorated, +the derivative and after +we apply the derivative, 189 00:09:33,725 --> 00:09:35,870 -simplify it to throw away +we simplify it to throw away 190 00:09:35,870 --> 00:09:39,050 @@ -885,11 +885,11 @@ 193 00:09:43,580 --> 00:09:45,515 -of the optional reg expression. +of the optional regular expression. 194 00:09:45,515 --> 00:09:49,670 -Here, our two evil wreck +Here are our two EVIL regular expressions we use as a test. 195 @@ -898,27 +898,27 @@ 196 00:09:52,460 --> 00:09:55,835 -the first one and the +the first one and we're actually now trying it 197 00:09:55,835 --> 00:10:00,515 out for strings between -09 thousand a's. +0 and 9000 a's. 198 00:10:00,515 --> 00:10:08,285 -So C. So also gets -simplification makes a, +So let's see. So also the +simplification makes 199 00:10:08,285 --> 00:10:10,655 -actually this case fast on. +actually this case faster. 200 00:10:10,655 --> 00:10:16,260 So we can now run strings -up to 9 thousand. +up to 9000. 201 00:10:16,260 --> 00:10:19,360 @@ -928,7 +928,7 @@ 202 00:10:19,360 --> 00:10:24,745 For I have to say my laptop -is now starting its fan. +is now starting to run its fan. 203 00:10:24,745 --> 00:10:28,525 @@ -951,12 +951,12 @@ 207 00:10:34,720 --> 00:10:37,720 just to be a tiny bit -more accurate map. +more accurate. 208 00:10:37,720 --> 00:10:42,025 So three seconds for a -string of 9 thousand a's. +string of 9000 a's. 209 00:10:42,025 --> 00:10:44,980 @@ -969,16 +969,16 @@ 211 00:10:46,240 --> 00:10:48,625 -this E star, star b. +this a star, star, b. 212 00:10:48,625 --> 00:10:51,250 And you can already see -on these numbers RB. +on these numbers... 213 00:10:51,250 --> 00:10:53,260 -And now you're really ambitious. +we are really ambitious. 214 00:10:53,260 --> 00:10:57,860 @@ -992,7 +992,7 @@ 216 00:11:07,780 --> 00:11:10,480 -The laptop thefts to +The laptop has to calculate quite a bit. 217 @@ -1001,7 +1001,7 @@ 218 00:11:12,940 --> 00:11:16,539 -3 million A's and +3 million a's and it needs a second. 219 @@ -1010,7 +1010,7 @@ 220 00:11:20,680 --> 00:11:25,280 -4 million a around two seconds. +4 million a's around two seconds. 221 00:11:27,030 --> 00:11:29,050 @@ -1018,11 +1018,11 @@ 222 00:11:29,050 --> 00:11:31,690 -I'm actually slowing it down. +I'm actually slowing it down 223 00:11:31,690 --> 00:11:34,150 -He artificially run the test +here artificially with running the test 224 00:11:34,150 --> 00:11:36,895 @@ -1037,7 +1037,7 @@ 226 00:11:42,749 --> 00:11:48,185 And remember that is a -string of 6 million A's. +string of 6 million a's. 227 00:11:48,185 --> 00:11:51,170 @@ -1047,7 +1047,7 @@ 228 00:11:51,170 --> 00:11:56,105 So the simplification also -made of first case faster. +made our first case faster. 229 00:11:56,105 --> 00:11:58,880 @@ -1057,11 +1057,11 @@ 230 00:11:58,880 --> 00:12:00,710 where we just have -this explicit and +this explicit 231 00:12:00,710 --> 00:12:03,050 -times that at this graph. +n-times. That is this graph. 232 00:12:03,050 --> 00:12:05,210 @@ -1083,21 +1083,21 @@ 236 00:12:16,745 --> 00:12:19,220 the existing regular -expression matches in +expression matchers in 237 00:12:19,220 --> 00:12:22,880 -Java eight, Python, +Java 8, Python, and JavaScript. 238 00:12:22,880 --> 00:12:26,030 And thanks to the -student be Ozma, +student we also 239 00:12:26,030 --> 00:12:27,935 -half a graph for Swift. +have a graph for Swift. 240 00:12:27,935 --> 00:12:29,750 @@ -1105,12 +1105,12 @@ 241 00:12:29,750 --> 00:12:33,320 -They need like for 30 Ace, +They need like for 30 a's, 242 00:12:33,320 --> 00:12:37,490 something like on the -magnitude of thirty-seconds. +magnitude of 30 seconds. 243 00:12:37,490 --> 00:12:39,740 @@ -1118,11 +1118,11 @@ 244 00:12:39,740 --> 00:12:42,665 -Java nine is slightly better. +Java 9 is slightly better. 245 00:12:42,665 --> 00:12:44,870 -Java nine and above this, +Java 9 and above, 246 00:12:44,870 --> 00:12:46,220 @@ -1130,7 +1130,7 @@ 247 00:12:46,220 --> 00:12:48,665 -the same exponential and nine, +the same example in Java 9, 248 00:12:48,665 --> 00:12:51,230 @@ -1139,18 +1139,18 @@ 249 00:12:51,230 --> 00:12:54,215 -like 40 thousand A's +like 40 thousand a's in half a minute. 250 00:12:54,215 --> 00:12:57,740 I have to admit I'm not -a 100% sure what they +100% sure what they 251 00:12:57,740 --> 00:13:03,575 did to improve the -performance between Java 89. +performance between Java 8 and 9. 252 00:13:03,575 --> 00:13:05,510 @@ -1172,7 +1172,7 @@ 256 00:13:12,230 --> 00:13:17,945 But that's still not -really competition fas. +really a competition for us. 257 00:13:17,945 --> 00:13:20,915 @@ -1184,8 +1184,8 @@ 259 00:13:24,335 --> 00:13:28,445 -We say it's something like -We need 6 million A's. +We said for something like +6 million a's we need 15 secs. 260 00:13:28,445 --> 00:13:31,760 @@ -1194,7 +1194,7 @@ 261 00:13:31,760 --> 00:13:33,770 -so he had needs 20 seconds. +so here it needs 20 seconds. 262 00:13:33,770 --> 00:13:37,250 @@ -1204,11 +1204,11 @@ 263 00:13:37,250 --> 00:13:40,865 But again, you can see -the trends. We rants. +the trends. We run. 264 00:13:40,865 --> 00:13:42,590 -Circles around them. +circles around them. 265 00:13:42,590 --> 00:13:46,530 @@ -1217,11 +1217,11 @@ 266 00:13:46,570 --> 00:13:49,774 So what's good about -this algorithm? +this algorithm 267 00:13:49,774 --> 00:13:51,605 -By pressure of ski? +by Brzozowski? 268 00:13:51,605 --> 00:13:54,500 @@ -1238,17 +1238,17 @@ 271 00:13:59,900 --> 00:14:03,950 -the End Times regular expression -is a little bit of work. +the n-times regular expression. +Is a little bit of work, but 272 00:14:03,950 --> 00:14:05,405 -We could extend that. +we could extend that. 273 00:14:05,405 --> 00:14:08,480 It also extends to the -not reg expression, +not regular expression, 274 00:14:08,480 --> 00:14:10,820 @@ -1275,7 +1275,7 @@ 279 00:14:22,955 --> 00:14:25,715 -Quite beautiful in Scala. +It's quite beautiful in Scala. 280 00:14:25,715 --> 00:14:28,160 @@ -1284,7 +1284,7 @@ 281 00:14:28,160 --> 00:14:31,760 -in C language or in Python. +in the C language or in Python. 282 00:14:31,760 --> 00:14:34,250 @@ -1293,12 +1293,12 @@ 283 00:14:34,250 --> 00:14:38,390 -Say the algorithm's actually -quite old already brush-off, +The algorithm is actually +quite old already. 284 00:14:38,390 --> 00:14:44,899 -ski wrote it down in +Brzozowski wrote it down in 1964 and his PhD thesis. 285 @@ -1313,12 +1313,12 @@ 287 00:14:54,095 --> 00:14:57,065 At the moment, we are -only solve the problem +only solved the problem 288 00:14:57,065 --> 00:15:02,240 -of Gmail reg expression -gamma string deaths, +of given a regular expression, +givven a string, does 289 00:15:02,240 --> 00:15:05,550 @@ -1327,8 +1327,8 @@ 290 00:15:05,550 --> 00:15:08,740 -The want to of course, use -it as part of our lexicon. +We want to of course, use +it as part of our lexer. 291 00:15:08,740 --> 00:15:12,370 @@ -1352,8 +1352,8 @@ 295 00:15:22,045 --> 00:15:26,110 -called suits Month and -the co-worker called Lou. +called Sulzmann and +the co-worker called Lu. 296 00:15:26,110 --> 00:15:30,880 @@ -1403,16 +1403,16 @@ 306 00:15:56,645 --> 00:15:58,550 -Is really fs string +Is it really true that if a string 307 00:15:58,550 --> 00:16:00,440 is in the language -of a reg expression. +of a regular expression. 308 00:16:00,440 --> 00:16:03,050 -Does that matter? Vd say yes. +Does that matter? I would say yes. 309 00:16:03,050 --> 00:16:07,070 @@ -1427,20 +1427,20 @@ 311 00:16:10,580 --> 00:16:13,775 actually extended to quite a -number of rec expressions. +number of regular expressions. 312 00:16:13,775 --> 00:16:17,810 -So this is t not reg +So this is the NOT regular expression that is 313 00:16:17,810 --> 00:16:19,760 -opposed to match strings which +supposed to match strings which 314 00:16:19,760 --> 00:16:22,235 -this reg expression cannot match. +this regular expression cannot match. 315 00:16:22,235 --> 00:16:24,245 @@ -1448,12 +1448,12 @@ 316 00:16:24,245 --> 00:16:28,640 -You're in the business of +If you're in the business of trying to find out what 317 00:16:28,640 --> 00:16:33,530 -our comments in languages like +are comments in languages like Java or C. Then you know, 318 @@ -1515,7 +1515,7 @@ 330 00:17:03,680 --> 00:17:06,410 which doesn't -contain as standard, +contain a star and a slash, 331 00:17:06,410 --> 00:17:11,180 @@ -1528,7 +1528,7 @@ 333 00:17:13,460 --> 00:17:15,995 -this reg expression is +this regular expression is actually quite useful. 334 @@ -1538,17 +1538,17 @@ 335 00:17:19,430 --> 00:17:21,995 -this kinda break -expression with automata? +for this kind of regular +expression with automata, 336 00:17:21,995 --> 00:17:24,710 -You will know that's -a bit of a pain, +you will know that's +a bit of a pain. 337 00:17:24,710 --> 00:17:26,675 -but for the Borowski, +But for the Brzozowski, 338 00:17:26,675 --> 00:17:28,370 @@ -1557,7 +1557,7 @@ 339 00:17:28,370 --> 00:17:30,710 The meaning of this -reg expression +regular expression 340 00:17:30,710 --> 00:17:32,255 @@ -1574,7 +1574,7 @@ 343 00:17:38,390 --> 00:17:40,055 -this arc and match. +this r can match. 344 00:17:40,055 --> 00:17:42,080 @@ -1582,7 +1582,7 @@ 345 00:17:42,080 --> 00:17:44,630 -what this reg expression +what this regular expression is supposed to match. 346 @@ -1605,8 +1605,8 @@ 350 00:17:54,140 --> 00:17:56,570 -just a test where -the eyes nullable. +just a test whether r +is nullable. 351 00:17:56,570 --> 00:17:58,985 @@ -1615,12 +1615,12 @@ 352 00:17:58,985 --> 00:18:00,680 -And men I have to build +And then I have to build 353 00:18:00,680 --> 00:18:03,620 the derivative of this -not reg expression. +NOT regular expression. 354 00:18:03,620 --> 00:18:05,420 @@ -1628,16 +1628,16 @@ 355 00:18:05,420 --> 00:18:07,325 -this permutation does derivative, +this .... 356 00:18:07,325 --> 00:18:10,070 derivative inside -the wreck expression +the regular expression 357 00:18:10,070 --> 00:18:12,575 -and keep the not on +and keep the NOT on the outermost level. 358 @@ -1647,7 +1647,7 @@ 359 00:18:15,515 --> 00:18:19,130 -adding this reg expression +adding this regular expression to the algorithm. 360 @@ -1662,7 +1662,7 @@ 362 00:18:24,739 --> 00:18:27,620 -I can see to loose trends. +I can see to loose strands. 363 00:18:27,620 --> 00:18:29,420 @@ -1675,8 +1675,8 @@ 365 00:18:32,855 --> 00:18:38,705 -And we also added the end -times out of necessity. +And we also added the +n-times out of necessity. 366 00:18:38,705 --> 00:18:41,120 @@ -1685,7 +1685,7 @@ 367 00:18:41,120 --> 00:18:44,840 -the not regular +the NOT regular expression on a slide. 368 @@ -1696,7 +1696,7 @@ 369 00:18:48,995 --> 00:18:52,970 to extend that to some of -the extended reg expression. +the extended regular expressions. 370 00:18:52,970 --> 00:18:57,260 @@ -1705,20 +1705,20 @@ 371 00:18:57,260 --> 00:19:01,550 -n times or between n and m times. +n-times or between n and m times. 372 00:19:01,550 --> 00:19:04,325 And I think also -plus and D option. +plus and option. 373 00:19:04,325 --> 00:19:07,220 -The other loose trend is, +The other loose strand is, 374 00:19:07,220 --> 00:19:09,200 -remember I did this while +remember I did this wild 375 00:19:09,200 --> 00:19:11,645 @@ -1727,11 +1727,11 @@ 376 00:19:11,645 --> 00:19:13,205 -Why on earth? +Why on earth 377 00:19:13,205 --> 00:19:14,480 -Is that all correct? +is that all correct? 378 00:19:14,480 --> 00:19:16,355