--- a/videos/02-algo3.srt Mon Oct 05 01:42:47 2020 +0100
+++ b/videos/02-algo3.srt Mon Oct 05 17:46:12 2020 +0100
@@ -1,6 +1,6 @@
1
00:00:06,350 --> 00:00:10,305
-They come back. I
+Welcome come back. I
can hear you saying,
2
@@ -19,7 +19,7 @@
5
00:00:17,415 --> 00:00:21,265
Specifically, we had two
-regular expressions.
+evil regular expressions.
6
00:00:21,265 --> 00:00:23,480
@@ -27,13 +27,13 @@
7
00:00:23,480 --> 00:00:27,480
-Where let's have a look at
+Well, let's have a look at
that other case in this video.
8
00:00:29,140 --> 00:00:32,705
So I'm back here
-in this Reto file.
+in this re2.sc file.
9
00:00:32,705 --> 00:00:35,180
@@ -42,7 +42,7 @@
10
00:00:35,180 --> 00:00:39,665
-nicely for strings between 01000.
+nicely for strings between 0 and 1000.
11
00:00:39,665 --> 00:00:42,725
@@ -55,11 +55,11 @@
13
00:00:44,090 --> 00:00:48,470
for relatively small
-strings between 020.
+strings between 0 and 20.
14
00:00:48,470 --> 00:00:50,240
-And let's see what it say.
+And let's see what it says.
15
00:00:50,240 --> 00:00:51,800
@@ -67,13 +67,13 @@
16
00:00:51,800 --> 00:00:57,320
-the ammonoids repo and
+the Amoonite REPL and it
doesn't look too bad.
17
00:00:57,320 --> 00:01:01,160
But this might be because
-20 is not general enough.
+20 is not generous enough.
18
00:01:01,160 --> 00:01:03,620
@@ -120,12 +120,12 @@
28
00:01:33,905 --> 00:01:37,640
-which is Daphne bad and we
+which is definitely bad and we
have to have a look at that.
29
00:01:37,640 --> 00:01:40,640
-Okay? It's always
+It's always
instructive with
30
@@ -135,13 +135,13 @@
31
00:01:43,460 --> 00:01:45,695
-the record expressions
-we calculate
+the regular expressions
+we calculate.
32
00:01:45,695 --> 00:01:49,625
-the evil to Is this
-a star, star B.
+The EVIL2 is this
+a star, star b.
33
00:01:49,625 --> 00:01:51,800
@@ -174,15 +174,15 @@
39
00:02:05,600 --> 00:02:08,420
-If I take this evil to wreck
+If I take this EVIL2 regular
40
00:02:08,420 --> 00:02:09,935
-expression and they built
+expression and then build
41
00:02:09,935 --> 00:02:12,470
-now longer and
+longer and
longer derivatives,
42
@@ -209,7 +209,7 @@
47
00:02:21,680 --> 00:02:26,870
But the size of the
-reg expression,
+regular expression,
48
00:02:26,870 --> 00:02:28,310
@@ -235,7 +235,7 @@
53
00:02:39,560 --> 00:02:42,785
-This reg expression
+This regular expression
just grew too much.
54
@@ -245,13 +245,13 @@
55
00:02:46,520 --> 00:02:49,655
-this record expression
+this regular expression
and make it not
56
00:02:49,655 --> 00:02:52,700
grow so much when we
-build derivative.
+build the derivative.
57
00:02:52,700 --> 00:02:54,020
@@ -279,7 +279,7 @@
62
00:03:03,740 --> 00:03:06,560
this example of a
-wreck expression R,
+regulat expression r,
63
00:03:06,560 --> 00:03:11,465
@@ -312,15 +312,15 @@
69
00:03:26,570 --> 00:03:29,015
which is this whole
-wreck expression again.
+regular expression again.
70
00:03:29,015 --> 00:03:30,935
-So you can also see.
+So you can also see
71
00:03:30,935 --> 00:03:34,745
-Derivative operation
+the derivative operation
introduces a lot of junk.
72
@@ -330,7 +330,7 @@
73
00:03:38,165 --> 00:03:42,455
-And this times one could
+And this times 1 could
be also thrown away.
74
@@ -344,16 +344,16 @@
76
00:03:48,110 --> 00:03:49,580
-it still be a,
+it still b,
77
00:03:49,580 --> 00:03:54,905
-so it would be B followed by
-R. And in this second case,
+so it would be b followed by
+r. And in this second case,
78
00:03:54,905 --> 00:03:57,515
-C0 times b, that will be just 0.
+0 times b, that will be just 0.
79
00:03:57,515 --> 00:03:59,270
@@ -361,13 +361,13 @@
80
00:03:59,270 --> 00:04:05,330
-11 times r would be just
+1 ... times r would be just
r. And in the last case,
81
00:04:05,330 --> 00:04:12,155
-C0 times B would be 00 plus
-0 is 00 times r would be 0.
+0 times b would be 0. 0 plus
+0 is 0. 0 times r would be 0.
82
00:04:12,155 --> 00:04:17,540
@@ -391,8 +391,8 @@
86
00:04:27,035 --> 00:04:31,130
-0 wouldn't go away by
-building annuity where tests.
+And the 0s wouldn't go away by
+building a new derivative.
87
00:04:31,130 --> 00:04:34,310
@@ -435,7 +435,7 @@
95
00:04:58,715 --> 00:05:01,415
this might have
-introduced some chunk.
+introduced some junk.
96
00:05:01,415 --> 00:05:02,960
@@ -443,7 +443,7 @@
97
00:05:02,960 --> 00:05:06,590
-essentially a additional function
+essentially an additional function
98
00:05:06,590 --> 00:05:08,945
@@ -455,15 +455,15 @@
100
00:05:10,490 --> 00:05:13,340
-keep steer, Rekha expression,
+keeps the regular expression,
101
00:05:13,340 --> 00:05:16,730
-relatively small, pickers debts,
+relatively small, because that
102
00:05:16,730 --> 00:05:19,535
-the Achilles heel
+is the Achilles' heel
of this algorithm.
103
@@ -488,7 +488,7 @@
107
00:05:30,455 --> 00:05:33,080
compress again this
-rec expression
+regular expression
108
00:05:33,080 --> 00:05:35,570
@@ -506,7 +506,7 @@
111
00:05:42,440 --> 00:05:46,040
-this reg expression and
+this regular expression and
deletes all this junk,
112
@@ -539,7 +539,7 @@
118
00:06:02,720 --> 00:06:05,060
-The only simplify when we have
+We only simplify when we have
119
00:06:05,060 --> 00:06:08,255
@@ -585,16 +585,16 @@
128
00:06:30,530 --> 00:06:34,880
Now, however, there
-is one small point.
+is one small point
129
00:06:34,880 --> 00:06:39,785
-You have to be aware of
-these simplification rules.
+you have to be aware of.
+These simplification rules
130
00:06:39,785 --> 00:06:42,740
-They need to be essentially
+they need to be essentially
applied backwards.
131
@@ -604,12 +604,12 @@
132
00:06:45,650 --> 00:06:49,085
-this record expression and
+this regular expression and
then have to find out,
133
00:06:49,085 --> 00:06:51,170
-can I apply simplification?
+can I apply the simplification?
134
00:06:51,170 --> 00:06:52,670
@@ -655,26 +655,26 @@
143
00:07:16,850 --> 00:07:20,885
-what that means is the
-matching this reg expression.
+what that means we're
+matching this regular expression.
144
00:07:20,885 --> 00:07:23,120
-Be test whether it's
+We test whether it's
an alternative or
145
00:07:23,120 --> 00:07:26,345
-a sequence only then we
+a sequence. Only then we
actually go into action.
146
00:07:26,345 --> 00:07:28,385
-Once we know.
+Once we know
147
00:07:28,385 --> 00:07:30,530
-In case of an alternative,
+in case of an alternative,
148
00:07:30,530 --> 00:07:32,120
@@ -682,7 +682,7 @@
149
00:07:32,120 --> 00:07:35,765
-P first, simplify each component,
+We first, simplify each component,
150
00:07:35,765 --> 00:07:39,095
@@ -696,7 +696,7 @@
152
00:07:41,180 --> 00:07:45,679
this simplification of
-R1 resulted into a 0.
+r1 resulted into a 0.
153
00:07:45,679 --> 00:07:47,884
@@ -704,17 +704,17 @@
154
00:07:47,884 --> 00:07:49,190
-than I can just replace it
+then I can just replace it
155
00:07:49,190 --> 00:07:53,375
-by r is two a simplified
-version of R2.
+by r2s, which a simplified
+version of r2.
156
00:07:53,375 --> 00:07:58,820
-If it came back r as
-two is actually 0,
+If it came back r2s
+is actually 0,
157
00:07:58,820 --> 00:08:00,410
@@ -722,7 +722,7 @@
158
00:08:00,410 --> 00:08:03,785
-the simplified version of a warm.
+the simplified version of a r1.
159
00:08:03,785 --> 00:08:07,460
@@ -736,7 +736,7 @@
161
00:08:11,180 --> 00:08:15,170
-next century I can throw away
+then I can throw away
one of the alternatives.
162
@@ -746,11 +746,11 @@
163
00:08:18,080 --> 00:08:21,155
-but the simplified components
+but the simplified components.
164
00:08:21,155 --> 00:08:23,915
-in the sequence is
+In the sequence case it is
pretty much the same.
165
@@ -761,16 +761,16 @@
166
00:08:27,950 --> 00:08:31,385
So I call first simplify
-and then have a look.
+and then have a look...
167
00:08:31,385 --> 00:08:33,020
-Does it matches C or one of
+Does it matches 0 one of
168
00:08:33,020 --> 00:08:36,035
-the simplification
-damage, just return 0.
+the simplifications,
+then I just return 0.
169
00:08:36,035 --> 00:08:38,330
@@ -787,15 +787,15 @@
172
00:08:43,310 --> 00:08:45,920
-And if this iss two is one,
+And if the rs2 is one,
173
00:08:45,920 --> 00:08:47,615
-and I keep the first component,
+and I keep the first component.
174
00:08:47,615 --> 00:08:50,359
-and otherwise it's
+And otherwise it's
still the sequence.
175
@@ -805,13 +805,13 @@
176
00:08:53,540 --> 00:08:55,700
-It's just a plain
+If it's just a plain
0. I leave it in.
177
00:08:55,700 --> 00:08:57,860
If it's a plain
-warm, I leave it in.
+1, I leave it in.
178
00:08:57,860 --> 00:09:00,170
@@ -820,7 +820,7 @@
179
00:09:00,170 --> 00:09:02,840
I don't open up this
-door and simplify it.
+star and simplify it.
180
00:09:02,840 --> 00:09:06,680
@@ -835,20 +835,20 @@
182
00:09:10,280 --> 00:09:12,980
So I'm now in the
-file read three,
+file re3.sc,
183
00:09:12,980 --> 00:09:17,405
-and it's the same as Reto.
+and it's the same as re2.sc.
184
00:09:17,405 --> 00:09:20,885
It still has this
-explicit and Times case,
+explicit and n-times case,
185
00:09:20,885 --> 00:09:24,665
-nullable and derivative
+nullable and derivative are
still the same.
186
@@ -862,12 +862,12 @@
188
00:09:29,960 --> 00:09:33,725
-the derivative after
-the apply deteriorated,
+the derivative and after
+we apply the derivative,
189
00:09:33,725 --> 00:09:35,870
-simplify it to throw away
+we simplify it to throw away
190
00:09:35,870 --> 00:09:39,050
@@ -885,11 +885,11 @@
193
00:09:43,580 --> 00:09:45,515
-of the optional reg expression.
+of the optional regular expression.
194
00:09:45,515 --> 00:09:49,670
-Here, our two evil wreck
+Here are our two EVIL regular
expressions we use as a test.
195
@@ -898,27 +898,27 @@
196
00:09:52,460 --> 00:09:55,835
-the first one and the
+the first one and we're
actually now trying it
197
00:09:55,835 --> 00:10:00,515
out for strings between
-09 thousand a's.
+0 and 9000 a's.
198
00:10:00,515 --> 00:10:08,285
-So C. So also gets
-simplification makes a,
+So let's see. So also the
+simplification makes
199
00:10:08,285 --> 00:10:10,655
-actually this case fast on.
+actually this case faster.
200
00:10:10,655 --> 00:10:16,260
So we can now run strings
-up to 9 thousand.
+up to 9000.
201
00:10:16,260 --> 00:10:19,360
@@ -928,7 +928,7 @@
202
00:10:19,360 --> 00:10:24,745
For I have to say my laptop
-is now starting its fan.
+is now starting to run its fan.
203
00:10:24,745 --> 00:10:28,525
@@ -951,12 +951,12 @@
207
00:10:34,720 --> 00:10:37,720
just to be a tiny bit
-more accurate map.
+more accurate.
208
00:10:37,720 --> 00:10:42,025
So three seconds for a
-string of 9 thousand a's.
+string of 9000 a's.
209
00:10:42,025 --> 00:10:44,980
@@ -969,16 +969,16 @@
211
00:10:46,240 --> 00:10:48,625
-this E star, star b.
+this a star, star, b.
212
00:10:48,625 --> 00:10:51,250
And you can already see
-on these numbers RB.
+on these numbers...
213
00:10:51,250 --> 00:10:53,260
-And now you're really ambitious.
+we are really ambitious.
214
00:10:53,260 --> 00:10:57,860
@@ -992,7 +992,7 @@
216
00:11:07,780 --> 00:11:10,480
-The laptop thefts to
+The laptop has to
calculate quite a bit.
217
@@ -1001,7 +1001,7 @@
218
00:11:12,940 --> 00:11:16,539
-3 million A's and
+3 million a's and
it needs a second.
219
@@ -1010,7 +1010,7 @@
220
00:11:20,680 --> 00:11:25,280
-4 million a around two seconds.
+4 million a's around two seconds.
221
00:11:27,030 --> 00:11:29,050
@@ -1018,11 +1018,11 @@
222
00:11:29,050 --> 00:11:31,690
-I'm actually slowing it down.
+I'm actually slowing it down
223
00:11:31,690 --> 00:11:34,150
-He artificially run the test
+here artificially with running the test
224
00:11:34,150 --> 00:11:36,895
@@ -1037,7 +1037,7 @@
226
00:11:42,749 --> 00:11:48,185
And remember that is a
-string of 6 million A's.
+string of 6 million a's.
227
00:11:48,185 --> 00:11:51,170
@@ -1047,7 +1047,7 @@
228
00:11:51,170 --> 00:11:56,105
So the simplification also
-made of first case faster.
+made our first case faster.
229
00:11:56,105 --> 00:11:58,880
@@ -1057,11 +1057,11 @@
230
00:11:58,880 --> 00:12:00,710
where we just have
-this explicit and
+this explicit
231
00:12:00,710 --> 00:12:03,050
-times that at this graph.
+n-times. That is this graph.
232
00:12:03,050 --> 00:12:05,210
@@ -1083,21 +1083,21 @@
236
00:12:16,745 --> 00:12:19,220
the existing regular
-expression matches in
+expression matchers in
237
00:12:19,220 --> 00:12:22,880
-Java eight, Python,
+Java 8, Python,
and JavaScript.
238
00:12:22,880 --> 00:12:26,030
And thanks to the
-student be Ozma,
+student we also
239
00:12:26,030 --> 00:12:27,935
-half a graph for Swift.
+have a graph for Swift.
240
00:12:27,935 --> 00:12:29,750
@@ -1105,12 +1105,12 @@
241
00:12:29,750 --> 00:12:33,320
-They need like for 30 Ace,
+They need like for 30 a's,
242
00:12:33,320 --> 00:12:37,490
something like on the
-magnitude of thirty-seconds.
+magnitude of 30 seconds.
243
00:12:37,490 --> 00:12:39,740
@@ -1118,11 +1118,11 @@
244
00:12:39,740 --> 00:12:42,665
-Java nine is slightly better.
+Java 9 is slightly better.
245
00:12:42,665 --> 00:12:44,870
-Java nine and above this,
+Java 9 and above,
246
00:12:44,870 --> 00:12:46,220
@@ -1130,7 +1130,7 @@
247
00:12:46,220 --> 00:12:48,665
-the same exponential and nine,
+the same example in Java 9,
248
00:12:48,665 --> 00:12:51,230
@@ -1139,18 +1139,18 @@
249
00:12:51,230 --> 00:12:54,215
-like 40 thousand A's
+like 40 thousand a's
in half a minute.
250
00:12:54,215 --> 00:12:57,740
I have to admit I'm not
-a 100% sure what they
+100% sure what they
251
00:12:57,740 --> 00:13:03,575
did to improve the
-performance between Java 89.
+performance between Java 8 and 9.
252
00:13:03,575 --> 00:13:05,510
@@ -1172,7 +1172,7 @@
256
00:13:12,230 --> 00:13:17,945
But that's still not
-really competition fas.
+really a competition for us.
257
00:13:17,945 --> 00:13:20,915
@@ -1184,8 +1184,8 @@
259
00:13:24,335 --> 00:13:28,445
-We say it's something like
-We need 6 million A's.
+We said for something like
+6 million a's we need 15 secs.
260
00:13:28,445 --> 00:13:31,760
@@ -1194,7 +1194,7 @@
261
00:13:31,760 --> 00:13:33,770
-so he had needs 20 seconds.
+so here it needs 20 seconds.
262
00:13:33,770 --> 00:13:37,250
@@ -1204,11 +1204,11 @@
263
00:13:37,250 --> 00:13:40,865
But again, you can see
-the trends. We rants.
+the trends. We run.
264
00:13:40,865 --> 00:13:42,590
-Circles around them.
+circles around them.
265
00:13:42,590 --> 00:13:46,530
@@ -1217,11 +1217,11 @@
266
00:13:46,570 --> 00:13:49,774
So what's good about
-this algorithm?
+this algorithm
267
00:13:49,774 --> 00:13:51,605
-By pressure of ski?
+by Brzozowski?
268
00:13:51,605 --> 00:13:54,500
@@ -1238,17 +1238,17 @@
271
00:13:59,900 --> 00:14:03,950
-the End Times regular expression
-is a little bit of work.
+the n-times regular expression.
+Is a little bit of work, but
272
00:14:03,950 --> 00:14:05,405
-We could extend that.
+we could extend that.
273
00:14:05,405 --> 00:14:08,480
It also extends to the
-not reg expression,
+not regular expression,
274
00:14:08,480 --> 00:14:10,820
@@ -1275,7 +1275,7 @@
279
00:14:22,955 --> 00:14:25,715
-Quite beautiful in Scala.
+It's quite beautiful in Scala.
280
00:14:25,715 --> 00:14:28,160
@@ -1284,7 +1284,7 @@
281
00:14:28,160 --> 00:14:31,760
-in C language or in Python.
+in the C language or in Python.
282
00:14:31,760 --> 00:14:34,250
@@ -1293,12 +1293,12 @@
283
00:14:34,250 --> 00:14:38,390
-Say the algorithm's actually
-quite old already brush-off,
+The algorithm is actually
+quite old already.
284
00:14:38,390 --> 00:14:44,899
-ski wrote it down in
+Brzozowski wrote it down in
1964 and his PhD thesis.
285
@@ -1313,12 +1313,12 @@
287
00:14:54,095 --> 00:14:57,065
At the moment, we are
-only solve the problem
+only solved the problem
288
00:14:57,065 --> 00:15:02,240
-of Gmail reg expression
-gamma string deaths,
+of given a regular expression,
+givven a string, does
289
00:15:02,240 --> 00:15:05,550
@@ -1327,8 +1327,8 @@
290
00:15:05,550 --> 00:15:08,740
-The want to of course, use
-it as part of our lexicon.
+We want to of course, use
+it as part of our lexer.
291
00:15:08,740 --> 00:15:12,370
@@ -1352,8 +1352,8 @@
295
00:15:22,045 --> 00:15:26,110
-called suits Month and
-the co-worker called Lou.
+called Sulzmann and
+the co-worker called Lu.
296
00:15:26,110 --> 00:15:30,880
@@ -1403,16 +1403,16 @@
306
00:15:56,645 --> 00:15:58,550
-Is really fs string
+Is it really true that if a string
307
00:15:58,550 --> 00:16:00,440
is in the language
-of a reg expression.
+of a regular expression.
308
00:16:00,440 --> 00:16:03,050
-Does that matter? Vd say yes.
+Does that matter? I would say yes.
309
00:16:03,050 --> 00:16:07,070
@@ -1427,20 +1427,20 @@
311
00:16:10,580 --> 00:16:13,775
actually extended to quite a
-number of rec expressions.
+number of regular expressions.
312
00:16:13,775 --> 00:16:17,810
-So this is t not reg
+So this is the NOT regular
expression that is
313
00:16:17,810 --> 00:16:19,760
-opposed to match strings which
+supposed to match strings which
314
00:16:19,760 --> 00:16:22,235
-this reg expression cannot match.
+this regular expression cannot match.
315
00:16:22,235 --> 00:16:24,245
@@ -1448,12 +1448,12 @@
316
00:16:24,245 --> 00:16:28,640
-You're in the business of
+If you're in the business of
trying to find out what
317
00:16:28,640 --> 00:16:33,530
-our comments in languages like
+are comments in languages like
Java or C. Then you know,
318
@@ -1515,7 +1515,7 @@
330
00:17:03,680 --> 00:17:06,410
which doesn't
-contain as standard,
+contain a star and a slash,
331
00:17:06,410 --> 00:17:11,180
@@ -1528,7 +1528,7 @@
333
00:17:13,460 --> 00:17:15,995
-this reg expression is
+this regular expression is
actually quite useful.
334
@@ -1538,17 +1538,17 @@
335
00:17:19,430 --> 00:17:21,995
-this kinda break
-expression with automata?
+for this kind of regular
+expression with automata,
336
00:17:21,995 --> 00:17:24,710
-You will know that's
-a bit of a pain,
+you will know that's
+a bit of a pain.
337
00:17:24,710 --> 00:17:26,675
-but for the Borowski,
+But for the Brzozowski,
338
00:17:26,675 --> 00:17:28,370
@@ -1557,7 +1557,7 @@
339
00:17:28,370 --> 00:17:30,710
The meaning of this
-reg expression
+regular expression
340
00:17:30,710 --> 00:17:32,255
@@ -1574,7 +1574,7 @@
343
00:17:38,390 --> 00:17:40,055
-this arc and match.
+this r can match.
344
00:17:40,055 --> 00:17:42,080
@@ -1582,7 +1582,7 @@
345
00:17:42,080 --> 00:17:44,630
-what this reg expression
+what this regular expression
is supposed to match.
346
@@ -1605,8 +1605,8 @@
350
00:17:54,140 --> 00:17:56,570
-just a test where
-the eyes nullable.
+just a test whether r
+is nullable.
351
00:17:56,570 --> 00:17:58,985
@@ -1615,12 +1615,12 @@
352
00:17:58,985 --> 00:18:00,680
-And men I have to build
+And then I have to build
353
00:18:00,680 --> 00:18:03,620
the derivative of this
-not reg expression.
+NOT regular expression.
354
00:18:03,620 --> 00:18:05,420
@@ -1628,16 +1628,16 @@
355
00:18:05,420 --> 00:18:07,325
-this permutation does derivative,
+this ....
356
00:18:07,325 --> 00:18:10,070
derivative inside
-the wreck expression
+the regular expression
357
00:18:10,070 --> 00:18:12,575
-and keep the not on
+and keep the NOT on
the outermost level.
358
@@ -1647,7 +1647,7 @@
359
00:18:15,515 --> 00:18:19,130
-adding this reg expression
+adding this regular expression
to the algorithm.
360
@@ -1662,7 +1662,7 @@
362
00:18:24,739 --> 00:18:27,620
-I can see to loose trends.
+I can see to loose strands.
363
00:18:27,620 --> 00:18:29,420
@@ -1675,8 +1675,8 @@
365
00:18:32,855 --> 00:18:38,705
-And we also added the end
-times out of necessity.
+And we also added the
+n-times out of necessity.
366
00:18:38,705 --> 00:18:41,120
@@ -1685,7 +1685,7 @@
367
00:18:41,120 --> 00:18:44,840
-the not regular
+the NOT regular
expression on a slide.
368
@@ -1696,7 +1696,7 @@
369
00:18:48,995 --> 00:18:52,970
to extend that to some of
-the extended reg expression.
+the extended regular expressions.
370
00:18:52,970 --> 00:18:57,260
@@ -1705,20 +1705,20 @@
371
00:18:57,260 --> 00:19:01,550
-n times or between n and m times.
+n-times or between n and m times.
372
00:19:01,550 --> 00:19:04,325
And I think also
-plus and D option.
+plus and option.
373
00:19:04,325 --> 00:19:07,220
-The other loose trend is,
+The other loose strand is,
374
00:19:07,220 --> 00:19:09,200
-remember I did this while
+remember I did this wild
375
00:19:09,200 --> 00:19:11,645
@@ -1727,11 +1727,11 @@
376
00:19:11,645 --> 00:19:13,205
-Why on earth?
+Why on earth
377
00:19:13,205 --> 00:19:14,480
-Is that all correct?
+is that all correct?
378
00:19:14,480 --> 00:19:16,355
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/videos/02-cw1.srt Mon Oct 05 17:46:12 2020 +0100
@@ -0,0 +1,1107 @@
+1
+00:00:06,410 --> 00:00:09,390
+The final video for this week.
+
+2
+00:00:09,390 --> 00:00:12,465
+And let me say something
+about the coursework.
+
+3
+00:00:12,465 --> 00:00:15,300
+First. You can solve
+
+4
+00:00:15,300 --> 00:00:17,745
+the coursework in any programming
+language you're like,
+
+5
+00:00:17,745 --> 00:00:22,440
+I already said that. You
+have to submit a PDF file.
+
+6
+00:00:22,440 --> 00:00:23,850
+There will be some questions,
+
+7
+00:00:23,850 --> 00:00:26,250
+so you have to write down the
+solution to this questions.
+
+8
+00:00:26,250 --> 00:00:30,315
+Please use a PDF and you have
+to submit your source code.
+
+9
+00:00:30,315 --> 00:00:35,580
+And yes, if you do use a
+language which isn't Scala,
+
+10
+00:00:35,580 --> 00:00:39,450
+it might help to tell me
+how I can run your code.
+
+11
+00:00:39,450 --> 00:00:42,550
+If I can't run your code,
+I will ask you anyway.
+
+12
+00:00:42,550 --> 00:00:50,044
+Also, please submit both the
+PDF and the code in a file,
+
+13
+00:00:50,044 --> 00:00:52,730
+in a zip file, which generates
+
+14
+00:00:52,730 --> 00:00:55,835
+a subdirectory with your
+name and your family name.
+
+15
+00:00:55,835 --> 00:00:58,970
+That we'll just save a
+lot of time for me to try
+
+16
+00:00:58,970 --> 00:01:02,030
+to figure out who
+has submitted what.
+
+17
+00:01:02,030 --> 00:01:04,445
+Please do that.
+
+18
+00:01:04,445 --> 00:01:07,789
+So what is the task
+into coursework?
+
+19
+00:01:07,789 --> 00:01:09,885
+I essentially showed you how
+
+20
+00:01:09,885 --> 00:01:11,995
+the Brzozowski matcher
+
+21
+00:01:11,995 --> 00:01:14,645
+works for the basic
+regular expressions.
+
+22
+00:01:14,645 --> 00:01:16,295
+I also showed you actually how it
+
+23
+00:01:16,295 --> 00:01:18,110
+works for the n-times.
+
+24
+00:01:18,110 --> 00:01:20,300
+And the task in coursework
+
+25
+00:01:20,300 --> 00:01:22,970
+is that you extend the
+Brzozowski matcher to
+
+26
+00:01:22,970 --> 00:01:25,820
+the other regular expressions
+
+27
+00:01:25,820 --> 00:01:27,800
+from the extended
+regular expressions.
+
+28
+00:01:27,800 --> 00:01:30,245
+So you're supposed
+to extended with
+
+29
+00:01:30,245 --> 00:01:34,490
+r-plus, optional r, for n-times.
+
+30
+00:01:34,490 --> 00:01:38,420
+You've already seen that.
+For 0 or more times,
+
+31
+00:01:38,420 --> 00:01:41,120
+but not more than m
+times regular expression.
+
+32
+00:01:41,120 --> 00:01:45,890
+For at least n-times regular
+expression and for between
+
+33
+00:01:45,890 --> 00:01:47,480
+n times and m times
+
+34
+00:01:47,480 --> 00:01:50,810
+regular expression and also this
+NOT regular expression.
+
+35
+00:01:50,810 --> 00:01:52,790
+So your task is to
+
+36
+00:01:52,790 --> 00:01:55,310
+essentially find the definition
+
+37
+00:01:55,310 --> 00:01:57,155
+for nullable in these cases.
+
+38
+00:01:57,155 --> 00:02:00,215
+Find a definition for derivative,
+
+39
+00:02:00,215 --> 00:02:02,480
+implement them,
+write them down in
+
+40
+00:02:02,480 --> 00:02:06,065
+a PDF and then do some
+experiments with them.
+
+41
+00:02:06,065 --> 00:02:08,875
+Well, how can you do that?
+
+42
+00:02:08,875 --> 00:02:10,370
+Well I've given you
+
+43
+00:02:10,370 --> 00:02:13,565
+the meaning of these additional
+regular expressions.
+
+44
+00:02:13,565 --> 00:02:16,730
+So here's, for example, the
+meaning of this r-plus.
+
+45
+00:02:16,730 --> 00:02:18,290
+So that would be, I have
+
+46
+00:02:18,290 --> 00:02:22,115
+at least one copy up to infinity.
+
+47
+00:02:22,115 --> 00:02:25,070
+And the optional-r would be it's
+
+48
+00:02:25,070 --> 00:02:28,370
+the language of r plus
+the empty string.
+
+49
+00:02:28,370 --> 00:02:30,440
+If I have it exactly n times,
+
+50
+00:02:30,440 --> 00:02:31,985
+then it would be
+just the language
+
+51
+00:02:31,985 --> 00:02:34,010
+of r exactly n-times.
+
+52
+00:02:34,010 --> 00:02:38,105
+And here you have union
+from 0 to m and so on.
+
+53
+00:02:38,105 --> 00:02:42,560
+This might be a slightly
+interesting regular expression.
+
+54
+00:02:42,560 --> 00:02:46,580
+So that's supposed to be the
+character set that is very
+
+55
+00:02:46,580 --> 00:02:48,410
+similar to character ranges like
+
+56
+00:02:48,410 --> 00:02:51,005
+in the existing regular
+expression matchers.
+
+57
+00:02:51,005 --> 00:02:52,820
+So this just says
+
+58
+00:02:52,820 --> 00:02:55,565
+...this regular
+expression can match
+
+59
+00:02:55,565 --> 00:03:00,860
+either the character c1 or
+the character c2 up to cn.
+
+60
+00:03:00,860 --> 00:03:03,620
+Why do I include
+that regular expression?
+
+61
+00:03:03,620 --> 00:03:09,050
+I could also just say
+c1 plus c2 plus c3...
+
+62
+00:03:09,050 --> 00:03:12,889
+a big alternative.
+Well that's possible.
+
+63
+00:03:12,889 --> 00:03:16,595
+But remember the Achilles'
+heel of this algorithm is,
+
+64
+00:03:16,595 --> 00:03:18,425
+if the regular expression is large,
+
+65
+00:03:18,425 --> 00:03:20,825
+then the computation
+take always a long time.
+
+66
+00:03:20,825 --> 00:03:23,840
+So I'm trying to compress
+that as much as I can.
+
+67
+00:03:23,840 --> 00:03:25,370
+So instead of giving a big
+
+68
+00:03:25,370 --> 00:03:29,134
+alternative, I am trying to
+give one regular expression,
+
+69
+00:03:29,134 --> 00:03:31,340
+which can not just match
+a single character,
+
+70
+00:03:31,340 --> 00:03:34,230
+but a set of characters.
+
+71
+00:03:34,630 --> 00:03:36,980
+How can you be sure that
+
+72
+00:03:36,980 --> 00:03:39,215
+definition you come
+up with are correct?
+
+73
+00:03:39,215 --> 00:03:41,450
+Well, I showed you which are
+
+74
+00:03:41,450 --> 00:03:45,170
+the properties these two
+functions need to satisfy.
+
+75
+00:03:45,170 --> 00:03:47,060
+I've given you here what
+
+76
+00:03:47,060 --> 00:03:49,325
+the meaning of these
+expressions are.
+
+77
+00:03:49,325 --> 00:03:52,700
+So you will always know what's
+on the right-hand side.
+
+78
+00:03:52,700 --> 00:03:54,515
+This is completely determined.
+
+79
+00:03:54,515 --> 00:03:56,720
+You essentially
+have to fill something
+
+80
+00:03:56,720 --> 00:03:58,910
+equivalent on the left-hand side.
+
+81
+00:03:58,910 --> 00:04:02,105
+That's the main task
+of the coursework.
+
+82
+00:04:02,105 --> 00:04:08,090
+And you can start from the
+files I provided on KEATS.
+
+83
+00:04:08,090 --> 00:04:12,125
+So you would just have to
+add these additional cases.
+
+84
+00:04:12,125 --> 00:04:15,020
+When you add these
+additional cases
+
+85
+00:04:15,020 --> 00:04:17,330
+and you're using the Scala language,
+
+86
+00:04:17,330 --> 00:04:18,980
+you can do me a favour.
+
+87
+00:04:18,980 --> 00:04:22,550
+You can call these
+constructors for
+
+88
+00:04:22,550 --> 00:04:25,400
+these different regular
+expressions as RANGE,
+
+89
+00:04:25,400 --> 00:04:28,985
+PLUS, OPTIONAL and NTIMES,
+UPTO, FROM and BETWEEN.
+
+90
+00:04:28,985 --> 00:04:31,370
+Remember I have this
+convention to always
+
+91
+00:04:31,370 --> 00:04:34,025
+use capital letters
+for regular expressions.
+
+92
+00:04:34,025 --> 00:04:36,680
+It would be nice if you could use
+
+93
+00:04:36,680 --> 00:04:39,260
+these names because
+then it will be
+
+94
+00:04:39,260 --> 00:04:42,335
+very easy for me
+to test your code.
+
+95
+00:04:42,335 --> 00:04:44,690
+If you're using a different
+programming language
+
+96
+00:04:44,690 --> 00:04:46,370
+like let's say Rust,
+
+97
+00:04:46,370 --> 00:04:48,860
+I expect some people will use that, where I
+
+98
+00:04:48,860 --> 00:04:51,380
+have no idea what are the
+conventions in your language,
+
+99
+00:04:51,380 --> 00:04:53,420
+how you have to call
+these constructors,
+
+100
+00:04:53,420 --> 00:04:56,420
+we will see whatever you
+submit. I'll have a look.
+
+101
+00:04:56,420 --> 00:04:59,360
+There's one more
+constraint I have to
+
+102
+00:04:59,360 --> 00:05:02,194
+impose to make this
+coursework interesting.
+
+103
+00:05:02,194 --> 00:05:04,730
+I do not want you
+that you just take
+
+104
+00:05:04,730 --> 00:05:07,370
+these extended regular
+expressions and that you
+
+105
+00:05:07,370 --> 00:05:10,550
+expand them using
+their definition.
+
+106
+00:05:10,550 --> 00:05:12,320
+Because that would make the regular
+
+107
+00:05:12,320 --> 00:05:13,955
+expression matcher very slow.
+
+108
+00:05:13,955 --> 00:05:16,160
+So for example, if
+you want to define
+
+109
+00:05:16,160 --> 00:05:18,785
+what's the derivative for r-plus,
+
+110
+00:05:18,785 --> 00:05:21,560
+then what does not
+count as a solution:
+
+111
+00:05:21,560 --> 00:05:24,770
+if you just expand that
+to the definition that r
+
+112
+00:05:24,770 --> 00:05:28,935
+plus is equivalent to
+r followed by r-star.
+
+113
+00:05:28,935 --> 00:05:32,845
+So in code, what I
+would like to not see,
+
+114
+00:05:32,845 --> 00:05:35,680
+I would not give any
+marks for that is,
+
+115
+00:05:35,680 --> 00:05:37,780
+if you add the plus as follows,
+
+116
+00:05:37,780 --> 00:05:39,910
+so that is still perfectly fine.
+
+117
+00:05:39,910 --> 00:05:42,160
+There's nothing you
+can do differently.
+
+118
+00:05:42,160 --> 00:05:44,065
+That would be the constructor.
+
+119
+00:05:44,065 --> 00:05:46,480
+But when you define nullable,
+
+120
+00:05:46,480 --> 00:05:49,180
+I do not want to see
+that you defined
+
+121
+00:05:49,180 --> 00:05:51,790
+this plus r as nullable
+
+122
+00:05:51,790 --> 00:05:54,385
+of the sequence of r and star-r,
+
+123
+00:05:54,385 --> 00:05:58,075
+just unfolding
+the definition.
+
+124
+00:05:58,075 --> 00:06:00,415
+As you can see, when you come
+
+125
+00:06:00,415 --> 00:06:02,815
+up with a much better
+solution for that.
+
+126
+00:06:02,815 --> 00:06:05,110
+This is a very inefficient way
+
+127
+00:06:05,110 --> 00:06:07,090
+for how to calculate nullable.
+
+128
+00:06:07,090 --> 00:06:10,825
+And so the same for derivative
+would not be allowed.
+
+129
+00:06:10,825 --> 00:06:13,895
+If you, I have the plus r here,
+
+130
+00:06:13,895 --> 00:06:16,685
+you can't just unfold
+the definition,
+
+131
+00:06:16,685 --> 00:06:19,790
+with r-plus
+being defined as r
+
+132
+00:06:19,790 --> 00:06:21,350
+followed by r star and
+
+133
+00:06:21,350 --> 00:06:23,375
+then just build the
+derivative of that.
+
+134
+00:06:23,375 --> 00:06:25,280
+You have to find something more
+
+135
+00:06:25,280 --> 00:06:28,730
+primitive that involves
+only the derivative of r,
+
+136
+00:06:28,730 --> 00:06:31,790
+essentially, nothing
+more involved. The same
+
+137
+00:06:31,790 --> 00:06:33,815
+if you have n-times, for example,
+
+138
+00:06:33,815 --> 00:06:39,935
+you can't just unfold this
+n-times in this sequence
+
+139
+00:06:39,935 --> 00:06:43,310
+of .... n-copies
+
+140
+00:06:43,310 --> 00:06:47,285
+of this r. You have to get
+something more primitive.
+
+141
+00:06:47,285 --> 00:06:49,760
+Otherwise, as you've
+seen earlier,
+
+142
+00:06:49,760 --> 00:06:53,135
+your regular expression matcher
+would be totally slow.
+
+143
+00:06:53,135 --> 00:06:55,475
+When you submit your solution,
+
+144
+00:06:55,475 --> 00:06:58,445
+please submit this
+solution in the PDF.
+
+145
+00:06:58,445 --> 00:07:01,580
+So give the cases of your definition
+
+146
+00:07:01,580 --> 00:07:06,004
+in a form like that for
+nullable and derivative.
+
+147
+00:07:06,004 --> 00:07:08,405
+And also implement it in code.
+
+148
+00:07:08,405 --> 00:07:10,910
+That is just helping me to
+
+149
+00:07:10,910 --> 00:07:13,400
+find out whether you have
+the correct solution or not.
+
+150
+00:07:13,400 --> 00:07:15,710
+So you needed a kind of
+mathematical notation of
+
+151
+00:07:15,710 --> 00:07:18,695
+your solution, and
+an actual code.
+
+152
+00:07:18,695 --> 00:07:22,414
+And then once you
+have your definition,
+
+153
+00:07:22,414 --> 00:07:25,699
+also make sure you try
+it out on some examples.
+
+154
+00:07:25,699 --> 00:07:28,970
+These will be the examples
+I will definitely try out,
+
+155
+00:07:28,970 --> 00:07:30,560
+but probably also more
+
+156
+00:07:30,560 --> 00:07:33,215
+depending on what
+definitions you give me.
+
+157
+00:07:33,215 --> 00:07:38,300
+There's one more question I
+ask you to do. You've seen
+
+158
+00:07:38,300 --> 00:07:40,130
+there's some regular
+expressions that
+
+159
+00:07:40,130 --> 00:07:42,215
+are involved for characters,
+
+160
+00:07:42,215 --> 00:07:46,925
+for character ranges or
+character sets essentially.
+
+161
+00:07:46,925 --> 00:07:48,665
+And sometimes I also want to have
+
+162
+00:07:48,665 --> 00:07:51,665
+just a reg expression which
+can match any character!!
+
+163
+00:07:51,665 --> 00:07:56,195
+And I could have for all of
+them separate definitions.
+
+164
+00:07:56,195 --> 00:07:57,710
+And that would mean I also need
+
+165
+00:07:57,710 --> 00:07:59,645
+separate definitions for nullable,
+
+166
+00:07:59,645 --> 00:08:02,405
+separate definitions
+for derivative.
+
+167
+00:08:02,405 --> 00:08:05,825
+But actually they can be
+generalized to just one constructor.
+
+168
+00:08:05,825 --> 00:08:08,210
+And that is if we introduce
+
+169
+00:08:08,210 --> 00:08:11,630
+a constructor for regular expressions,
+
+170
+00:08:11,630 --> 00:08:13,760
+which not just takes
+a single character
+
+171
+00:08:13,760 --> 00:08:15,469
+or set of characters.
+
+172
+00:08:15,469 --> 00:08:18,245
+But, which I call here CFUN.
+
+173
+00:08:18,245 --> 00:08:23,165
+And it takes a function from
+characters to booleans.
+
+174
+00:08:23,165 --> 00:08:24,470
+So if I want to match
+
+175
+00:08:24,470 --> 00:08:26,900
+just a single character
+then this function f
+
+176
+00:08:26,900 --> 00:08:29,060
+would only say true
+
+177
+00:08:29,060 --> 00:08:32,225
+if it gets as argument
+the single character.
+
+178
+00:08:32,225 --> 00:08:34,850
+If I have a character set,
+
+179
+00:08:34,850 --> 00:08:36,515
+then I would say, well,
+
+180
+00:08:36,515 --> 00:08:38,300
+I need a function
+which says true
+
+181
+00:08:38,300 --> 00:08:40,970
+for all the characters
+in this set.
+
+182
+00:08:40,970 --> 00:08:43,460
+And otherwise it's false.
+
+183
+00:08:43,460 --> 00:08:46,205
+And if I want to
+match any character!!,
+
+184
+00:08:46,205 --> 00:08:48,500
+then they should get a function
+
+185
+00:08:48,500 --> 00:08:53,450
+which says true for
+all characters.
+
+186
+00:08:53,450 --> 00:08:56,630
+Okay? If I have
+such a constructor
+
+187
+00:08:56,630 --> 00:09:00,140
+that actually I can save
+myself a bit of work.
+
+188
+00:09:00,140 --> 00:09:03,200
+And I can just have one case
+
+189
+00:09:03,200 --> 00:09:06,920
+for nullable and one
+case for CFUNS.
+
+190
+00:09:06,920 --> 00:09:09,800
+So that would then subsume
+the character ranges and
+
+191
+00:09:09,800 --> 00:09:13,385
+the character and also
+this ALL regular expression.
+
+192
+00:09:13,385 --> 00:09:15,380
+Ok, this was not explicitly
+included at the beginning,
+
+193
+00:09:15,380 --> 00:09:17,510
+but that's what I can now define.
+
+194
+00:09:17,510 --> 00:09:21,740
+And then I can define
+this for characters,
+
+195
+00:09:21,740 --> 00:09:23,885
+just as special cases
+for these functions.
+
+196
+00:09:23,885 --> 00:09:25,610
+And I told you already
+what this function
+
+197
+00:09:25,610 --> 00:09:28,025
+should look like in
+these three cases.
+
+198
+00:09:28,025 --> 00:09:30,350
+So I let ...
+
+199
+00:09:30,350 --> 00:09:33,515
+you decide how you're
+going to implement that.
+
+200
+00:09:33,515 --> 00:09:35,450
+If you first define
+
+201
+00:09:35,450 --> 00:09:38,495
+your implementation is
+all these explicit cases.
+
+202
+00:09:38,495 --> 00:09:41,900
+Because in these cases it's
+actually more intuitive,
+
+203
+00:09:41,900 --> 00:09:45,980
+what nullable and
+derivative should be.
+
+204
+00:09:45,980 --> 00:09:48,035
+And then in a second step,
+
+205
+00:09:48,035 --> 00:09:51,140
+you implement these
+more general cases.
+
+206
+00:09:51,140 --> 00:09:53,360
+And just keep the original ones
+
+207
+00:09:53,360 --> 00:09:54,500
+in your implementation if you
+
+208
+00:09:54,500 --> 00:09:57,710
+want, or if you feel
+adventurous enough,
+
+209
+00:09:57,710 --> 00:09:59,225
+and want to be lazy,
+
+210
+00:09:59,225 --> 00:10:01,115
+that you just implement that.
+
+211
+00:10:01,115 --> 00:10:02,660
+And then you have already done
+
+212
+00:10:02,660 --> 00:10:06,530
+at least two constructors
+by just implementing one.
+
+213
+00:10:06,530 --> 00:10:08,915
+But as said that
+requires a bit skill
+
+214
+00:10:08,915 --> 00:10:11,450
+of generalizing how
+that should be.
+
+215
+00:10:11,450 --> 00:10:14,180
+And the other questions
+are just then
+
+216
+00:10:14,180 --> 00:10:16,745
+trying out your
+expression matcher on
+
+217
+00:10:16,745 --> 00:10:19,580
+some examples, more
+power examples,
+
+218
+00:10:19,580 --> 00:10:22,400
+and they should be
+very easy to solve.
+
+219
+00:10:22,400 --> 00:10:24,605
+So I hope it's not
+too much asked for
+
+220
+00:10:24,605 --> 00:10:26,615
+and I hope you have fun.
+
+221
+00:10:26,615 --> 00:10:29,675
+It is not too much ask
+because you can,
+
+222
+00:10:29,675 --> 00:10:32,420
+I hope it's not too much
+because you can start from
+
+223
+00:10:32,420 --> 00:10:35,615
+my definitions or
+my implementation.
+
+224
+00:10:35,615 --> 00:10:39,425
+It's really essentially
+mostly is brain work,
+
+225
+00:10:39,425 --> 00:10:42,170
+how these nullable and
+derivative should work.
+
+226
+00:10:42,170 --> 00:10:44,510
+If you're in a
+language like Scala,
+
+227
+00:10:44,510 --> 00:10:48,875
+the implementation should
+then be easy-peasy.
+
+228
+00:10:48,875 --> 00:10:51,365
+If you are in a different language
+
+229
+00:10:51,365 --> 00:10:52,610
+I assume you also
+
+230
+00:10:52,610 --> 00:10:54,890
+dedicated to that
+language to start with,
+
+231
+00:10:54,890 --> 00:10:58,475
+so it should be also very
+easy for you to translate
+
+232
+00:10:58,475 --> 00:11:01,100
+my Scala code into whatever
+language you want to
+
+233
+00:11:01,100 --> 00:11:04,280
+do, first and then
+start from there.
+
+234
+00:11:04,280 --> 00:11:06,230
+If there are any questions,
+
+235
+00:11:06,230 --> 00:11:08,360
+please ask me or the TAs.
+
+236
+00:11:08,360 --> 00:11:10,040
+We are trying to be as helpful
+
+237
+00:11:10,040 --> 00:11:12,900
+as possible with the coursework.
+
+238
+00:11:13,240 --> 00:11:17,885
+Bear in mind, this is the
+first step in our compiler.
+
+239
+00:11:17,885 --> 00:11:21,005
+The coursework 2 will then
+build on top of that.
+
+240
+00:11:21,005 --> 00:11:25,770
+So it's better to get this
+correct first. Thanks.