--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/videos/02-algo1.srt Tue Sep 29 12:52:07 2020 +0100
@@ -0,0 +1,2617 @@
+1
+00:00:05,880 --> 00:00:09,700
+Welcome back.
+Remember this slide.
+
+2
+00:00:09,700 --> 00:00:11,500
+This slide said, What is
+
+3
+00:00:11,500 --> 00:00:14,500
+all wreck expression Metro
+actually supposed to do?
+
+4
+00:00:14,500 --> 00:00:16,570
+It will take two arguments and
+
+5
+00:00:16,570 --> 00:00:18,670
+reg expression R and a string
+
+6
+00:00:18,670 --> 00:00:21,580
+S. And it's supposed to say yes,
+
+7
+00:00:21,580 --> 00:00:23,440
+the wreck expression matches
+
+8
+00:00:23,440 --> 00:00:26,920
+the string if and only
+if the string is in
+
+9
+00:00:26,920 --> 00:00:28,855
+the language of R.
+
+10
+00:00:28,855 --> 00:00:32,410
+And if the string is not
+in the language of our,
+
+11
+00:00:32,410 --> 00:00:35,515
+then our algorithm has to say no.
+
+12
+00:00:35,515 --> 00:00:37,210
+And we can't use
+
+13
+00:00:37,210 --> 00:00:39,565
+this specification
+directly because remember,
+
+14
+00:00:39,565 --> 00:00:43,305
+this l Sometimes
+produces infinite sets.
+
+15
+00:00:43,305 --> 00:00:47,585
+And so we can test whether a
+string is an infinite set,
+
+16
+00:00:47,585 --> 00:00:50,090
+at least not easily.
+
+17
+00:00:50,090 --> 00:00:52,340
+And so what we have to do
+
+18
+00:00:52,340 --> 00:00:54,260
+instead is we have
+to be a bit more
+
+19
+00:00:54,260 --> 00:00:57,050
+clever and implement
+some operations
+
+20
+00:00:57,050 --> 00:00:59,284
+on Rekha expressions instead.
+
+21
+00:00:59,284 --> 00:01:03,275
+Because Weka expressions
+are always finite trees.
+
+22
+00:01:03,275 --> 00:01:05,870
+I should say the
+person who has been
+
+23
+00:01:05,870 --> 00:01:08,150
+clever is called brush-off ski.
+
+24
+00:01:08,150 --> 00:01:11,435
+It's his, I've written,
+I'm introducing here.
+
+25
+00:01:11,435 --> 00:01:15,515
+And his algorithm consists
+of two functions.
+
+26
+00:01:15,515 --> 00:01:17,840
+One is called
+nullable and it takes
+
+27
+00:01:17,840 --> 00:01:20,104
+a regular expression as argument.
+
+28
+00:01:20,104 --> 00:01:24,155
+And the idea is that
+this function nullable.
+
+29
+00:01:24,155 --> 00:01:26,450
+Testing whether
+the reg expression
+
+30
+00:01:26,450 --> 00:01:27,950
+can match the empty string.
+
+31
+00:01:27,950 --> 00:01:30,305
+So 0 cannot match any string.
+
+32
+00:01:30,305 --> 00:01:33,275
+So it cannot match the
+empty string either.
+
+33
+00:01:33,275 --> 00:01:35,465
+So that's defined as false.
+
+34
+00:01:35,465 --> 00:01:37,775
+This reg expression,
+the whole purpose
+
+35
+00:01:37,775 --> 00:01:39,680
+is that it can match
+the empty string.
+
+36
+00:01:39,680 --> 00:01:41,645
+So it's defined as true.
+
+37
+00:01:41,645 --> 00:01:44,599
+If a reg expression
+can match a character,
+
+38
+00:01:44,599 --> 00:01:47,045
+then it cannot match
+the empty string.
+
+39
+00:01:47,045 --> 00:01:49,445
+So that is defined as false.
+
+40
+00:01:49,445 --> 00:01:53,180
+If an alternative can
+match the empty string,
+
+41
+00:01:53,180 --> 00:01:56,960
+then either or one can
+match the empty string.
+
+42
+00:01:56,960 --> 00:01:59,720
+Or R2 can match the empty string.
+
+43
+00:01:59,720 --> 00:02:03,110
+So either nullable
+of R1 has to hold,
+
+44
+00:02:03,110 --> 00:02:06,945
+or nullable of R2 has to hold.
+
+45
+00:02:06,945 --> 00:02:09,925
+In this sequence, it's
+the other way around.
+
+46
+00:02:09,925 --> 00:02:12,790
+If this reg expression can
+match the empty string,
+
+47
+00:02:12,790 --> 00:02:16,615
+then R1 has to be able to
+match the empty string,
+
+48
+00:02:16,615 --> 00:02:20,290
+and R2 has to be able to
+match the empty string.
+
+49
+00:02:20,290 --> 00:02:22,555
+So here it's just the opposite.
+
+50
+00:02:22,555 --> 00:02:25,660
+In one case it is o and
+the other case it's end.
+
+51
+00:02:25,660 --> 00:02:27,970
+In the store record
+expression can match
+
+52
+00:02:27,970 --> 00:02:30,445
+always the empty string.
+So that is true.
+
+53
+00:02:30,445 --> 00:02:33,340
+So this is a simple
+recursive function
+
+54
+00:02:33,340 --> 00:02:37,179
+and should not be too
+difficult to implement.
+
+55
+00:02:37,179 --> 00:02:42,025
+Okay, this nullable function
+that is easy-peasy.
+
+56
+00:02:42,025 --> 00:02:44,604
+The second function, however,
+
+57
+00:02:44,604 --> 00:02:49,155
+is a bit more involved and
+that's just to be expected.
+
+58
+00:02:49,155 --> 00:02:53,075
+Remember people working in
+this area already for decades.
+
+59
+00:02:53,075 --> 00:02:57,305
+If they have some problems
+with runtime, for example,
+
+60
+00:02:57,305 --> 00:02:58,940
+we can't expect that as
+
+61
+00:02:58,940 --> 00:03:03,260
+simple fix will solve all
+the problems in the world.
+
+62
+00:03:03,260 --> 00:03:06,530
+So I admit the second
+function is a bit more
+
+63
+00:03:06,530 --> 00:03:10,085
+complicated and make sure
+that you understand it.
+
+64
+00:03:10,085 --> 00:03:12,140
+And it also just
+chose this preserves
+
+65
+00:03:12,140 --> 00:03:14,345
+the is a very clever guy.
+
+66
+00:03:14,345 --> 00:03:15,800
+Actually, I have to say he was
+
+67
+00:03:15,800 --> 00:03:17,720
+a clever guy because
+I think he either
+
+68
+00:03:17,720 --> 00:03:21,650
+died last year or
+the year before.
+
+69
+00:03:21,650 --> 00:03:25,505
+And he came up with this
+algorithm already in 1964.
+
+70
+00:03:25,505 --> 00:03:27,260
+It somehow got lost,
+
+71
+00:03:27,260 --> 00:03:30,305
+but has been rediscovered
+in the last ten years.
+
+72
+00:03:30,305 --> 00:03:34,685
+So the idea of the second
+function is the following.
+
+73
+00:03:34,685 --> 00:03:38,120
+Imagine you have a
+reexpression and it can
+
+74
+00:03:38,120 --> 00:03:41,930
+match a string of the
+form C followed by as.
+
+75
+00:03:41,930 --> 00:03:44,810
+So the C is the first
+character of that string.
+
+76
+00:03:44,810 --> 00:03:48,605
+So I mentioned that can
+match this kind of string.
+
+77
+00:03:48,605 --> 00:03:50,330
+The question is now,
+
+78
+00:03:50,330 --> 00:03:52,400
+what would a wrecker
+expression look
+
+79
+00:03:52,400 --> 00:03:54,695
+like that can match chest
+
+80
+00:03:54,695 --> 00:03:57,140
+s. You probably remember
+
+81
+00:03:57,140 --> 00:03:59,300
+that from the semantic
+that every relative,
+
+82
+00:03:59,300 --> 00:04:00,860
+there was also
+something of chopping
+
+83
+00:04:00,860 --> 00:04:02,210
+off the first character.
+
+84
+00:04:02,210 --> 00:04:04,940
+Here it's the same.
+If a reg expression
+
+85
+00:04:04,940 --> 00:04:07,835
+can match a string
+starting with a C,
+
+86
+00:04:07,835 --> 00:04:11,090
+we're looking for a wreck
+expression which can match
+
+87
+00:04:11,090 --> 00:04:15,215
+the rest of the string where
+the c has been chopped off.
+
+88
+00:04:15,215 --> 00:04:17,690
+And this operation will be called
+
+89
+00:04:17,690 --> 00:04:21,049
+the derivative of a
+wreck expression.
+
+90
+00:04:21,049 --> 00:04:22,205
+And it will also take
+
+91
+00:04:22,205 --> 00:04:25,460
+a character as argument
+and the rec expression.
+
+92
+00:04:25,460 --> 00:04:28,730
+And in contrast to
+the semantic records,
+
+93
+00:04:28,730 --> 00:04:31,310
+semantic derivative, which works
+
+94
+00:04:31,310 --> 00:04:34,430
+over languages or
+sets of strings.
+
+95
+00:04:34,430 --> 00:04:39,620
+This derivative works
+over regular expressions.
+
+96
+00:04:39,620 --> 00:04:41,330
+Okay, here's this function.
+
+97
+00:04:41,330 --> 00:04:43,970
+It's defined recursively over
+
+98
+00:04:43,970 --> 00:04:46,370
+the structure of rec expressions.
+
+99
+00:04:46,370 --> 00:04:48,814
+The first argument
+is the character,
+
+100
+00:04:48,814 --> 00:04:52,700
+and the second one is
+a wreck expression.
+
+101
+00:04:52,700 --> 00:04:56,510
+And remember the idea
+is we're looking for
+
+102
+00:04:56,510 --> 00:05:00,680
+a wreck expression that
+can match everything.
+
+103
+00:05:00,680 --> 00:05:03,125
+This reg expression
+as argument can match
+
+104
+00:05:03,125 --> 00:05:07,040
+except for the C. So now let's
+look at this first case.
+
+105
+00:05:07,040 --> 00:05:10,550
+So this reg expression
+cannot match any string.
+
+106
+00:05:10,550 --> 00:05:14,270
+So it certainly cannot
+match any string starting
+
+107
+00:05:14,270 --> 00:05:16,910
+a C. So we have to look
+
+108
+00:05:16,910 --> 00:05:20,090
+for and reg expression which
+can not much anything.
+
+109
+00:05:20,090 --> 00:05:22,700
+Well, that's the purpose
+of this record expression,
+
+110
+00:05:22,700 --> 00:05:24,815
+so we define it as 0.
+
+111
+00:05:24,815 --> 00:05:28,130
+In the next case,
+this reg expression
+
+112
+00:05:28,130 --> 00:05:30,440
+can match the empty string,
+
+113
+00:05:30,440 --> 00:05:33,440
+but it cannot match
+any string that starts
+
+114
+00:05:33,440 --> 00:05:36,350
+with C. So also in this case,
+
+115
+00:05:36,350 --> 00:05:39,560
+we just define it as
+the bracket question,
+
+116
+00:05:39,560 --> 00:05:41,615
+which cannot match anything.
+
+117
+00:05:41,615 --> 00:05:45,170
+In the next case, this
+C as the argument to
+
+118
+00:05:45,170 --> 00:05:48,335
+the derivative and this d
+is the Rekha expression.
+
+119
+00:05:48,335 --> 00:05:50,225
+So now there are two cases.
+
+120
+00:05:50,225 --> 00:05:53,525
+If this C and this D
+is actually equal.
+
+121
+00:05:53,525 --> 00:05:55,970
+That means this record
+expression can match
+
+122
+00:05:55,970 --> 00:05:59,240
+a string which just contains C0.
+
+123
+00:05:59,240 --> 00:06:01,505
+So if we strip that off,
+
+124
+00:06:01,505 --> 00:06:04,790
+motor remains is
+the empty string.
+
+125
+00:06:04,790 --> 00:06:06,890
+What is a regular expression
+
+126
+00:06:06,890 --> 00:06:09,170
+look like that can
+match the empty string.
+
+127
+00:06:09,170 --> 00:06:11,915
+Well, that's the
+purpose of the warm.
+
+128
+00:06:11,915 --> 00:06:15,440
+And if c is not equal to d,
+
+129
+00:06:15,440 --> 00:06:17,630
+then this reg expression
+
+130
+00:06:17,630 --> 00:06:19,220
+cannot match anything
+which starts with
+
+131
+00:06:19,220 --> 00:06:23,780
+a C. So again it will
+be defined as just 0.
+
+132
+00:06:23,780 --> 00:06:29,390
+Okay? Now, the alternative case,
+
+133
+00:06:29,390 --> 00:06:31,970
+if this reg expression can
+
+134
+00:06:31,970 --> 00:06:36,050
+match the strings
+starting with C,
+
+135
+00:06:36,050 --> 00:06:40,820
+then it can either be
+matched by the Ongwen.
+
+136
+00:06:40,820 --> 00:06:44,495
+Or it can be much by the R2.
+
+137
+00:06:44,495 --> 00:06:50,090
+If they are one can match C
+and then following a string,
+
+138
+00:06:50,090 --> 00:06:53,975
+then we just have to recursively
+call the derivative for
+
+139
+00:06:53,975 --> 00:06:56,570
+R to get a reg expression
+
+140
+00:06:56,570 --> 00:06:59,135
+that can match the
+rest of the string.
+
+141
+00:06:59,135 --> 00:07:02,690
+And the same we can do with R2.
+
+142
+00:07:02,690 --> 00:07:06,110
+You can find a reg expression
+which can match everything.
+
+143
+00:07:06,110 --> 00:07:07,850
+This R2 can match,
+
+144
+00:07:07,850 --> 00:07:09,710
+starting with C, bad,
+
+145
+00:07:09,710 --> 00:07:12,590
+which chopping off this C. Okay?
+
+146
+00:07:12,590 --> 00:07:16,370
+So now if you have to find
+the break expression,
+
+147
+00:07:16,370 --> 00:07:20,030
+which can match all the strings
+where C is tripped off.
+
+148
+00:07:20,030 --> 00:07:22,295
+Then we just have to
+take the alternatives
+
+149
+00:07:22,295 --> 00:07:24,965
+of these two derivatives.
+
+150
+00:07:24,965 --> 00:07:29,390
+Okay? Now to sequence case,
+
+151
+00:07:29,390 --> 00:07:33,005
+this sequence case is the
+most complicated one.
+
+152
+00:07:33,005 --> 00:07:35,180
+And let's look first at
+
+153
+00:07:35,180 --> 00:07:39,335
+the second case where
+the Earth's brush.
+
+154
+00:07:39,335 --> 00:07:42,830
+Okay? So if this
+regular expression
+
+155
+00:07:42,830 --> 00:07:46,145
+can match a string
+starting with C,
+
+156
+00:07:46,145 --> 00:07:48,155
+then the following
+must have happened.
+
+157
+00:07:48,155 --> 00:07:51,905
+First, the R1 must have matched
+a string starting with C
+
+158
+00:07:51,905 --> 00:07:56,420
+and then anything coming
+afterwards with r2.
+
+159
+00:07:56,420 --> 00:07:59,660
+Okay? So in this case I only
+
+160
+00:07:59,660 --> 00:08:03,260
+have to call this
+derivative for this r one.
+
+161
+00:08:03,260 --> 00:08:06,830
+And find the reg expression
+which can match everything
+
+162
+00:08:06,830 --> 00:08:11,555
+this R one can match except
+for this C chopped off.
+
+163
+00:08:11,555 --> 00:08:15,830
+And I have to build that
+sequence derivative of that.
+
+164
+00:08:15,830 --> 00:08:18,260
+So that's what the As per se.
+
+165
+00:08:18,260 --> 00:08:21,860
+So I take the
+derivative of R1 and I
+
+166
+00:08:21,860 --> 00:08:23,480
+put the R2 on
+
+167
+00:08:23,480 --> 00:08:25,865
+the back because that's
+the rest of the string.
+
+168
+00:08:25,865 --> 00:08:29,240
+Ok? So that's the only
+case we have to consider
+
+169
+00:08:29,240 --> 00:08:32,750
+this sequence case
+except if not the,
+
+170
+00:08:32,750 --> 00:08:37,895
+how one can match the
+sea and something else.
+
+171
+00:08:37,895 --> 00:08:42,965
+But if on mismatching
+actually the empty string,
+
+172
+00:08:42,965 --> 00:08:48,890
+this case actually the R two
+is in charge of matching
+
+173
+00:08:48,890 --> 00:08:51,590
+the string starting with C. So in
+
+174
+00:08:51,590 --> 00:08:55,490
+this case we have to call
+the derivative for R2.
+
+175
+00:08:55,490 --> 00:08:57,875
+So that's why we have
+these two cases.
+
+176
+00:08:57,875 --> 00:09:00,455
+So if R1 is nullable,
+
+177
+00:09:00,455 --> 00:09:03,245
+then it can match
+the empty string.
+
+178
+00:09:03,245 --> 00:09:05,330
+And we have to
+consider the case that
+
+179
+00:09:05,330 --> 00:09:08,045
+this R2 is matching a string
+
+180
+00:09:08,045 --> 00:09:10,700
+starting with C. And so we have
+
+181
+00:09:10,700 --> 00:09:14,210
+to call the derivative
+for this R2 in this case.
+
+182
+00:09:14,210 --> 00:09:18,680
+Otherwise, the R1 will
+be in charge of matching
+
+183
+00:09:18,680 --> 00:09:20,840
+a part of that
+string starting with
+
+184
+00:09:20,840 --> 00:09:24,695
+C. And I have to call the
+derivative on this R1.
+
+185
+00:09:24,695 --> 00:09:30,670
+Okay? I hope this makes sense.
+
+186
+00:09:30,670 --> 00:09:34,150
+Go over that and
+also the handouts.
+
+187
+00:09:34,150 --> 00:09:37,465
+Again. With that, that's
+it's really subtle.
+
+188
+00:09:37,465 --> 00:09:40,945
+And how do I remember this case?
+
+189
+00:09:40,945 --> 00:09:42,430
+Well, I know it's
+
+190
+00:09:42,430 --> 00:09:45,310
+an if condition and the
+condition is nullable,
+
+191
+00:09:45,310 --> 00:09:48,160
+then I will always remember
+
+192
+00:09:48,160 --> 00:09:53,275
+the else branch where pushes
+NG derivative over the R1.
+
+193
+00:09:53,275 --> 00:09:55,780
+So are usually write
+down the S punch
+
+194
+00:09:55,780 --> 00:09:59,650
+first and then construct
+the thin branch by saying,
+
+195
+00:09:59,650 --> 00:10:01,525
+well, I just repeat this.
+
+196
+00:10:01,525 --> 00:10:03,760
+And I have to remember
+in that case I
+
+197
+00:10:03,760 --> 00:10:06,580
+have to build also
+derivative of R2.
+
+198
+00:10:06,580 --> 00:10:12,695
+Okay? Finally, the star case.
+
+199
+00:10:12,695 --> 00:10:15,665
+Ok. So here again
+we're looking for
+
+200
+00:10:15,665 --> 00:10:17,300
+a reg expression which
+
+201
+00:10:17,300 --> 00:10:19,745
+can match the string
+starting with C,
+
+202
+00:10:19,745 --> 00:10:22,355
+except that the c has
+been chopped off.
+
+203
+00:10:22,355 --> 00:10:28,640
+So if r star has to match
+a string starting with C,
+
+204
+00:10:28,640 --> 00:10:32,735
+then at least we need one
+copy of this r. Okay?
+
+205
+00:10:32,735 --> 00:10:34,310
+So at least one copy of
+
+206
+00:10:34,310 --> 00:10:37,010
+this r has matched
+something which starts with
+
+207
+00:10:37,010 --> 00:10:38,870
+a C and then afterwards
+
+208
+00:10:38,870 --> 00:10:41,570
+come 0 more copies
+of this OnStar.
+
+209
+00:10:41,570 --> 00:10:45,530
+Okay? What we do there
+is we trying to find
+
+210
+00:10:45,530 --> 00:10:47,960
+the Rekha expression
+which can match
+
+211
+00:10:47,960 --> 00:10:50,915
+this first part of the string.
+
+212
+00:10:50,915 --> 00:10:53,255
+However, where the
+sea is chopped off.
+
+213
+00:10:53,255 --> 00:10:55,130
+And then we just say, well,
+
+214
+00:10:55,130 --> 00:10:57,050
+the rest has to be
+matched again with
+
+215
+00:10:57,050 --> 00:10:59,030
+0 or more copies of
+
+216
+00:10:59,030 --> 00:11:02,600
+this R. So that's why
+it's defined like this.
+
+217
+00:11:02,600 --> 00:11:09,050
+Okay? S8. Please take care
+with this definition.
+
+218
+00:11:09,050 --> 00:11:11,435
+That's not so simple.
+
+219
+00:11:11,435 --> 00:11:13,250
+Once you get a hang of it,
+
+220
+00:11:13,250 --> 00:11:15,170
+however, it makes perfect sense.
+
+221
+00:11:15,170 --> 00:11:17,210
+So let me explain it in
+
+222
+00:11:17,210 --> 00:11:20,825
+different ways in
+the next slides.
+
+223
+00:11:20,825 --> 00:11:24,695
+Okay, let's look
+first some examples.
+
+224
+00:11:24,695 --> 00:11:27,140
+So here is a regular expression,
+
+225
+00:11:27,140 --> 00:11:29,390
+R. And let's have
+
+226
+00:11:29,390 --> 00:11:32,450
+a look at these three
+derivatives according to a,
+
+227
+00:11:32,450 --> 00:11:38,405
+B, and C. And Vishal do
+with d1 for the a. Ok.
+
+228
+00:11:38,405 --> 00:11:42,379
+So here is our reg expression
+
+229
+00:11:42,379 --> 00:11:45,334
+and was very generous
+with dependent a thesis.
+
+230
+00:11:45,334 --> 00:11:48,140
+And the outermost is a star.
+
+231
+00:11:48,140 --> 00:11:52,550
+So if people now the
+derivative according to a,
+
+232
+00:11:52,550 --> 00:11:55,474
+the character a of
+that wreck expression.
+
+233
+00:11:55,474 --> 00:11:57,380
+Okay? So the first thing we
+
+234
+00:11:57,380 --> 00:11:59,555
+have to analyze is the K star.
+
+235
+00:11:59,555 --> 00:12:04,370
+Ok? So here's direct expression,
+which we are looking at.
+
+236
+00:12:04,370 --> 00:12:09,170
+This are the outermost
+constructor is this star.
+
+237
+00:12:09,170 --> 00:12:11,510
+If you go back to the definition,
+
+238
+00:12:11,510 --> 00:12:13,625
+I hope you have it next to you,
+
+239
+00:12:13,625 --> 00:12:16,340
+then this star case is defined
+
+240
+00:12:16,340 --> 00:12:20,000
+as u taking just the
+inside of the star
+
+241
+00:12:20,000 --> 00:12:23,030
+and apply this derivative and
+
+242
+00:12:23,030 --> 00:12:26,765
+leave the are on the
+outside at the end.
+
+243
+00:12:26,765 --> 00:12:29,990
+Ok. So that's the first step.
+
+244
+00:12:29,990 --> 00:12:32,030
+Now we have to analyze
+
+245
+00:12:32,030 --> 00:12:36,035
+the derivative according to
+a of this record expression,
+
+246
+00:12:36,035 --> 00:12:38,000
+which is an alternative.
+
+247
+00:12:38,000 --> 00:12:39,665
+So the outermost is a plus.
+
+248
+00:12:39,665 --> 00:12:41,375
+Ok, that's very easy again,
+
+249
+00:12:41,375 --> 00:12:45,185
+we just have to push the
+derivative into each component,
+
+250
+00:12:45,185 --> 00:12:47,705
+into the a, followed by b.
+
+251
+00:12:47,705 --> 00:12:49,145
+And in this segment,
+
+252
+00:12:49,145 --> 00:12:51,185
+alternative into b. Ok,
+
+253
+00:12:51,185 --> 00:12:56,030
+so we take the derivative
+of each according to a way.
+
+254
+00:12:56,030 --> 00:13:00,635
+Now this one is a sequence
+break expression.
+
+255
+00:13:00,635 --> 00:13:02,210
+This most complicated case.
+
+256
+00:13:02,210 --> 00:13:04,160
+So the first of all
+you have to test is,
+
+257
+00:13:04,160 --> 00:13:07,910
+is the first component
+nullable of this sequence?
+
+258
+00:13:07,910 --> 00:13:09,200
+Well, that is a,
+
+259
+00:13:09,200 --> 00:13:12,740
+in this case, a on its
+own is not nullable.
+
+260
+00:13:12,740 --> 00:13:14,210
+So vn, the easy case,
+
+261
+00:13:14,210 --> 00:13:17,000
+we only have a single derivative
+
+262
+00:13:17,000 --> 00:13:19,370
+pushed in the first component.
+
+263
+00:13:19,370 --> 00:13:25,160
+So we have the derivative
+of a with the character a.
+
+264
+00:13:25,160 --> 00:13:27,920
+Okay, that's now
+the character case.
+
+265
+00:13:27,920 --> 00:13:29,720
+And in this case the character
+
+266
+00:13:29,720 --> 00:13:31,715
+in the regular expression agree.
+
+267
+00:13:31,715 --> 00:13:33,890
+So it's defined as one.
+
+268
+00:13:33,890 --> 00:13:37,550
+Ok? In the other case,
+
+269
+00:13:37,550 --> 00:13:39,710
+the break expression is P,
+
+270
+00:13:39,710 --> 00:13:41,675
+But the characters a.
+
+271
+00:13:41,675 --> 00:13:46,385
+So in this case
+it's defined as 0.
+
+272
+00:13:46,385 --> 00:13:50,630
+Okay? So that's what the
+derivative would be.
+
+273
+00:13:50,630 --> 00:13:52,160
+This r is there
+
+274
+00:13:52,160 --> 00:13:55,280
+because originally we
+started with a star.
+
+275
+00:13:55,280 --> 00:13:58,295
+This expression is that
+star at expression.
+
+276
+00:13:58,295 --> 00:14:02,780
+Ok? So the derivative
+according to a
+
+277
+00:14:02,780 --> 00:14:07,610
+up that reg expression
+is this expression.
+
+278
+00:14:07,610 --> 00:14:10,970
+We just have to
+substitute this back in.
+
+279
+00:14:10,970 --> 00:14:13,745
+Just coming back to this slide.
+
+280
+00:14:13,745 --> 00:14:16,160
+So far, they're only analyze
+
+281
+00:14:16,160 --> 00:14:19,505
+the derivative according
+to a single character.
+
+282
+00:14:19,505 --> 00:14:23,960
+But we can also very easily
+extend that to whole strings.
+
+283
+00:14:23,960 --> 00:14:26,360
+So if you build the
+derivative according
+
+284
+00:14:26,360 --> 00:14:27,905
+to the empty string,
+
+285
+00:14:27,905 --> 00:14:30,875
+we just return the
+Rekha expression.
+
+286
+00:14:30,875 --> 00:14:35,585
+If we have a string
+starting with character c,
+
+287
+00:14:35,585 --> 00:14:37,850
+remember that can
+be any character.
+
+288
+00:14:37,850 --> 00:14:42,170
+Then we build first the simple
+derivative according to
+
+289
+00:14:42,170 --> 00:14:44,360
+that first character and
+
+290
+00:14:44,360 --> 00:14:46,925
+continue with the
+rest of the string.
+
+291
+00:14:46,925 --> 00:14:50,615
+So here you see again,
+my personal convention.
+
+292
+00:14:50,615 --> 00:14:54,365
+Everything which works on
+lists has this S at the end.
+
+293
+00:14:54,365 --> 00:14:57,125
+So this function is
+for single characters.
+
+294
+00:14:57,125 --> 00:14:59,179
+This one is for strings,
+
+295
+00:14:59,179 --> 00:15:02,450
+but it uses the one
+for the character.
+
+296
+00:15:02,450 --> 00:15:04,025
+Essentially what it does is
+
+297
+00:15:04,025 --> 00:15:06,185
+it chops off the first character,
+
+298
+00:15:06,185 --> 00:15:09,800
+builds the derivative, then
+chops off the next character,
+
+299
+00:15:09,800 --> 00:15:13,760
+builds the derivative of
+the result, and so on.
+
+300
+00:15:13,760 --> 00:15:17,000
+Having this function,
+we can actually now
+
+301
+00:15:17,000 --> 00:15:20,600
+state what the algorithm
+is, the complete algorithm.
+
+302
+00:15:20,600 --> 00:15:23,465
+So the pro shops ki mantra
+
+303
+00:15:23,465 --> 00:15:24,860
+takes a regular expression as
+
+304
+00:15:24,860 --> 00:15:26,915
+argument and a
+string as argument.
+
+305
+00:15:26,915 --> 00:15:30,920
+And is supposed to say yes if
+the reg expression matches
+
+306
+00:15:30,920 --> 00:15:33,560
+the string or No
+
+307
+00:15:33,560 --> 00:15:36,065
+if the reg expression does
+not match the string.
+
+308
+00:15:36,065 --> 00:15:37,715
+And how does it do that?
+
+309
+00:15:37,715 --> 00:15:42,560
+Well, it takes this string
+s And this reg expression,
+
+310
+00:15:42,560 --> 00:15:43,925
+and it first built
+
+311
+00:15:43,925 --> 00:15:48,845
+successive derivatives until
+that string is exhaust that.
+
+312
+00:15:48,845 --> 00:15:52,115
+Okay? Then you have
+a final derivative,
+
+313
+00:15:52,115 --> 00:15:53,839
+a final reg expression.
+
+314
+00:15:53,839 --> 00:15:55,370
+And you test whether
+
+315
+00:15:55,370 --> 00:15:57,920
+this reg expression can
+match the empty string.
+
+316
+00:15:57,920 --> 00:16:01,370
+If yes, then the
+original reg expression
+
+317
+00:16:01,370 --> 00:16:03,245
+is r can match the string.
+
+318
+00:16:03,245 --> 00:16:05,210
+If no, if it cannot match
+
+319
+00:16:05,210 --> 00:16:08,030
+the final derivative
+with the empty string,
+
+320
+00:16:08,030 --> 00:16:10,280
+then know this
+regular expression,
+
+321
+00:16:10,280 --> 00:16:12,905
+R cannot match that string.
+
+322
+00:16:12,905 --> 00:16:16,010
+I know it looks
+very anticlimactic,
+
+323
+00:16:16,010 --> 00:16:19,625
+but that's actually the
+beauty of this algorithm,
+
+324
+00:16:19,625 --> 00:16:22,760
+that it's not that complicated.
+
+325
+00:16:22,760 --> 00:16:25,340
+So how does the algorithm work?
+
+326
+00:16:25,340 --> 00:16:27,634
+In a concrete example?
+
+327
+00:16:27,634 --> 00:16:31,580
+Imagine you have a string, abc
+
+328
+00:16:31,580 --> 00:16:34,370
+and you have a break
+expression, say R1.
+
+329
+00:16:34,370 --> 00:16:37,520
+And you want to find out
+whether this or one can match
+
+330
+00:16:37,520 --> 00:16:41,300
+that string abc or not.
+How do you do that?
+
+331
+00:16:41,300 --> 00:16:45,140
+Well, you would first
+take this R1 and you
+
+332
+00:16:45,140 --> 00:16:47,150
+build the derivative according
+
+333
+00:16:47,150 --> 00:16:49,880
+to the first character, D-A.
+
+334
+00:16:49,880 --> 00:16:53,015
+Okay? You get the derivative,
+
+335
+00:16:53,015 --> 00:16:55,294
+which I call here R2.
+
+336
+00:16:55,294 --> 00:16:58,640
+Then you take the next
+character, the B.
+
+337
+00:16:58,640 --> 00:17:04,535
+You now build the derivative
+according to b of this R2.
+
+338
+00:17:04,535 --> 00:17:07,460
+Okay? So you take the
+result of the first step,
+
+339
+00:17:07,460 --> 00:17:09,530
+you feed it into the second step,
+
+340
+00:17:09,530 --> 00:17:11,810
+and you take the
+second character.
+
+341
+00:17:11,810 --> 00:17:17,075
+Then you do this also with c.
+So you get a derivative R3,
+
+342
+00:17:17,075 --> 00:17:22,460
+and you build the derivative
+of R three according to c,
+
+343
+00:17:22,460 --> 00:17:24,185
+you get an R four.
+
+344
+00:17:24,185 --> 00:17:26,300
+Okay, so that's the
+final derivative.
+
+345
+00:17:26,300 --> 00:17:27,530
+The string is exhausted.
+
+346
+00:17:27,530 --> 00:17:29,570
+We build derivatives
+according to a, B,
+
+347
+00:17:29,570 --> 00:17:34,610
+and C. Now we just test whether
+this r four is nullable.
+
+348
+00:17:34,610 --> 00:17:37,175
+If it says yes,
+
+349
+00:17:37,175 --> 00:17:41,510
+then df break expression metro
+will just say true, yes,
+
+350
+00:17:41,510 --> 00:17:43,340
+this original reg expression,
+
+351
+00:17:43,340 --> 00:17:47,270
+the R1, will be able to
+match that string abc.
+
+352
+00:17:47,270 --> 00:17:50,585
+And if this test returns false,
+
+353
+00:17:50,585 --> 00:17:53,015
+then the algorithm says false.
+
+354
+00:17:53,015 --> 00:17:56,975
+This reg expression will
+not match the string.
+
+355
+00:17:56,975 --> 00:18:00,260
+Ok, you might ask
+why on earth does
+
+356
+00:18:00,260 --> 00:18:02,960
+that algorithm
+actually work away?
+
+357
+00:18:02,960 --> 00:18:06,515
+Here's an explanation
+for why it works.
+
+358
+00:18:06,515 --> 00:18:10,190
+Imagine you have a wreck
+expression R1, okay?
+
+359
+00:18:10,190 --> 00:18:13,220
+And you have a string abc,
+
+360
+00:18:13,220 --> 00:18:14,270
+and you want to find out
+
+361
+00:18:14,270 --> 00:18:17,180
+whether one can
+match that string.
+
+362
+00:18:17,180 --> 00:18:18,799
+And for the moment,
+
+363
+00:18:18,799 --> 00:18:22,610
+let's assume that it
+can match that string.
+
+364
+00:18:22,610 --> 00:18:26,315
+Ok? So the language L of
+
+365
+00:18:26,315 --> 00:18:30,185
+R will actually
+contain that string,
+
+366
+00:18:30,185 --> 00:18:31,805
+otherwise it wouldn't match that.
+
+367
+00:18:31,805 --> 00:18:36,710
+Okay? So ABC is in
+this language, okay?
+
+368
+00:18:36,710 --> 00:18:39,785
+If I now take the
+semantic derivative,
+
+369
+00:18:39,785 --> 00:18:43,145
+that means I look at all
+the strings in this f,
+
+370
+00:18:43,145 --> 00:18:46,820
+R1, and further out
+
+371
+00:18:46,820 --> 00:18:48,740
+all the ones which
+do not start with
+
+372
+00:18:48,740 --> 00:18:51,260
+an a, I discharge them.
+
+373
+00:18:51,260 --> 00:18:54,545
+And I only look the one
+which start with an a.
+
+374
+00:18:54,545 --> 00:18:56,465
+And of those strings,
+
+375
+00:18:56,465 --> 00:18:58,475
+I chop off this a.
+
+376
+00:18:58,475 --> 00:19:01,025
+So after this
+romantic derivative,
+
+377
+00:19:01,025 --> 00:19:05,735
+this set of strings will
+contain just B and C.
+
+378
+00:19:05,735 --> 00:19:12,830
+Ok. Now if I build the next
+semantic derivative of that,
+
+379
+00:19:12,830 --> 00:19:14,345
+then I would look at
+
+380
+00:19:14,345 --> 00:19:16,850
+all the strings which
+start with a P,
+
+381
+00:19:16,850 --> 00:19:21,350
+and forget about everything
+else of the ones.
+
+382
+00:19:21,350 --> 00:19:27,905
+I know they start with B.
+I just chop of the B. Ok.
+
+383
+00:19:27,905 --> 00:19:31,655
+So in this whole set here,
+
+384
+00:19:31,655 --> 00:19:33,785
+in this whole set here,
+
+385
+00:19:33,785 --> 00:19:39,030
+there will be now a string
+which is just c. Okay?
+
+386
+00:19:39,190 --> 00:19:44,420
+Then I built the third
+semantic derivative
+
+387
+00:19:44,420 --> 00:19:47,300
+because I want to find out
+whether ABC is involved.
+
+388
+00:19:47,300 --> 00:19:50,540
+Okay? So now I look
+at all the strings in
+
+389
+00:19:50,540 --> 00:19:52,820
+here and look at
+
+390
+00:19:52,820 --> 00:19:55,340
+them whether they start
+with a C. If yes,
+
+391
+00:19:55,340 --> 00:19:56,885
+I chop off the sea.
+
+392
+00:19:56,885 --> 00:19:59,120
+And put in markets remaining.
+
+393
+00:19:59,120 --> 00:20:00,425
+So in this case,
+
+394
+00:20:00,425 --> 00:20:02,510
+if I have the string c
+
+395
+00:20:02,510 --> 00:20:04,550
+in this language and
+I chop off this,
+
+396
+00:20:04,550 --> 00:20:07,700
+see what is remaining
+is the empty string.
+
+397
+00:20:07,700 --> 00:20:09,695
+So we have to check of
+
+398
+00:20:09,695 --> 00:20:14,510
+that language whether it
+contains the empty string.
+
+399
+00:20:14,510 --> 00:20:18,800
+If yes, then the
+original R1 can match
+
+400
+00:20:18,800 --> 00:20:21,050
+this ABC because this ABC
+
+401
+00:20:21,050 --> 00:20:24,119
+must have been in this language.
+
+402
+00:20:24,130 --> 00:20:28,565
+And if in the end there wasn't
+the empty string, then,
+
+403
+00:20:28,565 --> 00:20:33,575
+then this ABC Watson in
+this language of one.
+
+404
+00:20:33,575 --> 00:20:36,260
+And so the electron must have,
+
+405
+00:20:36,260 --> 00:20:38,880
+or the metro must have failed.
+
+406
+00:20:39,040 --> 00:20:42,530
+The clever bit is that here
+
+407
+00:20:42,530 --> 00:20:45,530
+the explanation is for languages.
+
+408
+00:20:45,530 --> 00:20:49,835
+Remember, this
+semantic derivative
+
+409
+00:20:49,835 --> 00:20:53,450
+works over languages and they
+sometimes can be in finite.
+
+410
+00:20:53,450 --> 00:20:55,730
+So that's not really
+an algorithm.
+
+411
+00:20:55,730 --> 00:20:58,880
+Yeah, that's just
+explaining the idea with
+
+412
+00:20:58,880 --> 00:21:02,525
+preserves key
+achieved was that he
+
+413
+00:21:02,525 --> 00:21:06,440
+now works with this derivative
+America expressions and
+
+414
+00:21:06,440 --> 00:21:10,715
+somehow imitates what
+happens on these languages.
+
+415
+00:21:10,715 --> 00:21:14,135
+Because remember if you
+have an wreck expression,
+
+416
+00:21:14,135 --> 00:21:17,405
+are you want to test
+whether can match APC,
+
+417
+00:21:17,405 --> 00:21:22,550
+then you take first
+derivative according to a.
+
+418
+00:21:22,550 --> 00:21:25,760
+So you will get a wreck
+expression which can match b
+
+419
+00:21:25,760 --> 00:21:29,464
+and c If R could match abc.
+
+420
+00:21:29,464 --> 00:21:31,430
+So after the first derivative,
+
+421
+00:21:31,430 --> 00:21:33,620
+you will get a wreck expression
+which can match B and
+
+422
+00:21:33,620 --> 00:21:37,070
+C. If you take the
+second derivative,
+
+423
+00:21:37,070 --> 00:21:41,225
+you will get a reexpression
+which can match c alone.
+
+424
+00:21:41,225 --> 00:21:44,180
+And if you take the
+final derivative,
+
+425
+00:21:44,180 --> 00:21:46,070
+then you will get
+
+426
+00:21:46,070 --> 00:21:48,260
+rec expression which hopefully
+
+427
+00:21:48,260 --> 00:21:49,715
+can match the empty string.
+
+428
+00:21:49,715 --> 00:21:53,780
+If it does, then this
+R can match the ABC.
+
+429
+00:21:53,780 --> 00:21:55,655
+And if it doesn't, then
+
+430
+00:21:55,655 --> 00:21:58,680
+ABC couldn't be
+matched by this on.
+
+431
+00:21:58,900 --> 00:22:02,990
+Okay, let's have a look
+how this pans out in code.
+
+432
+00:22:02,990 --> 00:22:06,050
+Here's defile RE1.
+
+433
+00:22:06,050 --> 00:22:07,940
+It's also uploaded on Keith,
+
+434
+00:22:07,940 --> 00:22:10,625
+so you can see exactly
+what I'm doing.
+
+435
+00:22:10,625 --> 00:22:13,970
+And actually I already saw
+that file because I showed you
+
+436
+00:22:13,970 --> 00:22:15,710
+how my wreck expressions are
+
+437
+00:22:15,710 --> 00:22:17,960
+defined with the
+abstract classes here.
+
+438
+00:22:17,960 --> 00:22:21,155
+And here, the six cases
+for 0-1 character,
+
+439
+00:22:21,155 --> 00:22:23,540
+I turn a TIF in
+sequence and star.
+
+440
+00:22:23,540 --> 00:22:26,705
+Ok. So the first
+function nullable,
+
+441
+00:22:26,705 --> 00:22:28,760
+the simple one, takes
+
+442
+00:22:28,760 --> 00:22:32,120
+a regular expression as
+argument and returns a boolean.
+
+443
+00:22:32,120 --> 00:22:34,280
+And then with this
+pattern matching,
+
+444
+00:22:34,280 --> 00:22:37,040
+we just go through
+all these six cases
+
+445
+00:22:37,040 --> 00:22:38,900
+are serious defined as false.
+
+446
+00:22:38,900 --> 00:22:43,234
+One is defined as true
+character for any character,
+
+447
+00:22:43,234 --> 00:22:45,455
+this null return false.
+
+448
+00:22:45,455 --> 00:22:47,540
+The alternative is to find here,
+
+449
+00:22:47,540 --> 00:22:50,000
+so that's the or in Scala.
+
+450
+00:22:50,000 --> 00:22:52,700
+And for the sequence,
+that's the end.
+
+451
+00:22:52,700 --> 00:22:56,180
+And this star, no matter
+what the reg expression is,
+
+452
+00:22:56,180 --> 00:22:59,540
+it will always match the
+empty string, so true.
+
+453
+00:22:59,540 --> 00:23:02,225
+So nanobots, very easy.
+
+454
+00:23:02,225 --> 00:23:07,430
+The derivative is also not
+so much more complicated.
+
+455
+00:23:07,430 --> 00:23:08,974
+It takes two arguments,
+
+456
+00:23:08,974 --> 00:23:11,810
+a character and the
+rec expression,
+
+457
+00:23:11,810 --> 00:23:14,405
+and returns a wreck expression.
+
+458
+00:23:14,405 --> 00:23:16,340
+So now we just have to analyze
+
+459
+00:23:16,340 --> 00:23:18,890
+what's that reg
+expression looks like.
+
+460
+00:23:18,890 --> 00:23:23,870
+If it's 0, we return
+01, we return 0.
+
+461
+00:23:23,870 --> 00:23:26,990
+If the character is the
+wreck expressions character
+
+462
+00:23:26,990 --> 00:23:30,260
+d. We test whether it's
+equal to this character.
+
+463
+00:23:30,260 --> 00:23:32,225
+We want to take the
+derivative off.
+
+464
+00:23:32,225 --> 00:23:36,185
+If yes, return one, otherwise 0.
+
+465
+00:23:36,185 --> 00:23:38,600
+The alternative which has pushed
+
+466
+00:23:38,600 --> 00:23:39,860
+the derivative under
+
+467
+00:23:39,860 --> 00:23:42,710
+this alternative,
+that's the easy one.
+
+468
+00:23:42,710 --> 00:23:44,690
+Here's the sequence case where we
+
+469
+00:23:44,690 --> 00:23:47,015
+first have to test
+if it's nullable,
+
+470
+00:23:47,015 --> 00:23:49,040
+If it's not the have to push
+
+471
+00:23:49,040 --> 00:23:52,160
+the derivative over
+the first DR1.
+
+472
+00:23:52,160 --> 00:23:56,135
+And otherwise if it is null
+above we have two cases.
+
+473
+00:23:56,135 --> 00:24:00,605
+Either you have to push
+it over the R1 or R2.
+
+474
+00:24:00,605 --> 00:24:03,860
+And a star case, this one.
+
+475
+00:24:03,860 --> 00:24:07,160
+We just build the sequence
+of the derivative of
+
+476
+00:24:07,160 --> 00:24:11,600
+R1 and append the
+or as a sequence,
+
+477
+00:24:11,600 --> 00:24:14,195
+put the star at the end.
+
+478
+00:24:14,195 --> 00:24:19,430
+Okay, so here's this
+function for strings.
+
+479
+00:24:19,430 --> 00:24:21,740
+So a list of characters.
+
+480
+00:24:21,740 --> 00:24:23,870
+This function takes N,
+
+481
+00:24:23,870 --> 00:24:26,480
+S list of characters argument
+and a reg expression
+
+482
+00:24:26,480 --> 00:24:29,360
+as argument and returns
+a wreck expression.
+
+483
+00:24:29,360 --> 00:24:31,460
+And we just have to
+pattern match what
+
+484
+00:24:31,460 --> 00:24:34,055
+that string looks like
+or this list looks like.
+
+485
+00:24:34,055 --> 00:24:35,360
+If it's the empty list,
+
+486
+00:24:35,360 --> 00:24:38,030
+it just immediately
+return the R. If
+
+487
+00:24:38,030 --> 00:24:42,020
+it's something of the
+form C followed by S,
+
+488
+00:24:42,020 --> 00:24:45,050
+then we build first the
+derivative according
+
+489
+00:24:45,050 --> 00:24:48,345
+to a C. And then
+the result of that,
+
+490
+00:24:48,345 --> 00:24:50,090
+people recursively call
+
+491
+00:24:50,090 --> 00:24:53,675
+the derivative
+according to s. Okay?
+
+492
+00:24:53,675 --> 00:24:56,060
+And now the main mantra,
+
+493
+00:24:56,060 --> 00:24:59,360
+it takes a reg
+expression and takes
+
+494
+00:24:59,360 --> 00:25:02,870
+a string and returns a
+boolean, true or false.
+
+495
+00:25:02,870 --> 00:25:05,705
+And it first builds
+the derivative.
+
+496
+00:25:05,705 --> 00:25:07,430
+The only thing I have to do here
+
+497
+00:25:07,430 --> 00:25:09,410
+because I'm implementing
+it and scholar,
+
+498
+00:25:09,410 --> 00:25:15,064
+I have to coerce between strings
+and lists of characters.
+
+499
+00:25:15,064 --> 00:25:16,580
+So he, I take first
+
+500
+00:25:16,580 --> 00:25:20,810
+the two list of the S that
+gives me a list of characters.
+
+501
+00:25:20,810 --> 00:25:23,135
+Then I build this derivative
+
+502
+00:25:23,135 --> 00:25:26,045
+is ds because I'm
+looking at strings.
+
+503
+00:25:26,045 --> 00:25:28,160
+And then at the end,
+
+504
+00:25:28,160 --> 00:25:33,050
+built-in nullable of
+the final derivative.
+
+505
+00:25:33,050 --> 00:25:37,310
+The beauty of all this
+is that in Scala,
+
+506
+00:25:37,310 --> 00:25:40,085
+these definition which
+I had on the slides
+
+507
+00:25:40,085 --> 00:25:43,835
+go almost one-to-one into code.
+
+508
+00:25:43,835 --> 00:25:45,605
+And that's of course,
+
+509
+00:25:45,605 --> 00:25:47,480
+makes it very easy
+to implement in
+
+510
+00:25:47,480 --> 00:25:49,730
+a functional language like Scala.
+
+511
+00:25:49,730 --> 00:25:52,865
+We can also now try
+out some examples.
+
+512
+00:25:52,865 --> 00:25:56,960
+This was the example
+I had on the slide.
+
+513
+00:25:56,960 --> 00:25:58,370
+And we could now calculate
+
+514
+00:25:58,370 --> 00:26:00,305
+what's the derivative
+according to a,
+
+515
+00:26:00,305 --> 00:26:02,720
+B, and C of this
+record expression.
+
+516
+00:26:02,720 --> 00:26:07,040
+And I hope you didn't
+make any mistake.
+
+517
+00:26:07,040 --> 00:26:09,260
+Now of course we want
+to see whether B
+
+518
+00:26:09,260 --> 00:26:11,390
+do any better with
+this algorithm.
+
+519
+00:26:11,390 --> 00:26:13,715
+Then Python and Ruby.
+
+520
+00:26:13,715 --> 00:26:16,070
+And let's first have a
+look at this example.
+
+521
+00:26:16,070 --> 00:26:18,079
+So this reg expression.
+
+522
+00:26:18,079 --> 00:26:19,880
+The problem is that in
+
+523
+00:26:19,880 --> 00:26:22,070
+our reg expression
+metro so far we have
+
+524
+00:26:22,070 --> 00:26:24,530
+the sequence right
+expression where we
+
+525
+00:26:24,530 --> 00:26:27,200
+don't have this optional
+wreck expression.
+
+526
+00:26:27,200 --> 00:26:30,800
+And we don't have this n
+times Rekha expression,
+
+527
+00:26:30,800 --> 00:26:36,605
+but we can quickly implement
+that in our implementation.
+
+528
+00:26:36,605 --> 00:26:40,549
+So if you want to build the
+optional reg expression,
+
+529
+00:26:40,549 --> 00:26:41,870
+which is defined here,
+
+530
+00:26:41,870 --> 00:26:44,570
+a little function which
+takes a reg expression and
+
+531
+00:26:44,570 --> 00:26:47,360
+returns the alternative of R one.
+
+532
+00:26:47,360 --> 00:26:49,624
+So it essentially
+takes the definition
+
+533
+00:26:49,624 --> 00:26:53,240
+of optional R and
+replaces it by that.
+
+534
+00:26:53,240 --> 00:26:56,150
+The end times what we
+essentially do it,
+
+535
+00:26:56,150 --> 00:27:01,535
+There's beaches built n
+copies of this r. Ok?
+
+536
+00:27:01,535 --> 00:27:04,745
+So if this n times was 0,
+
+537
+00:27:04,745 --> 00:27:06,245
+we just return one.
+
+538
+00:27:06,245 --> 00:27:11,570
+If it's one, then we return
+just the reg expression.
+
+539
+00:27:11,570 --> 00:27:15,575
+And if it's a, something
+bigger than one,
+
+540
+00:27:15,575 --> 00:27:18,560
+then we just built the
+sequence of one of
+
+541
+00:27:18,560 --> 00:27:20,270
+these copies and call
+
+542
+00:27:20,270 --> 00:27:22,925
+this function again
+with n minus one.
+
+543
+00:27:22,925 --> 00:27:26,330
+So we just now n copies of that.
+
+544
+00:27:26,330 --> 00:27:30,710
+Okay? Okay, so we can look
+at our first example.
+
+545
+00:27:30,710 --> 00:27:32,629
+This is the work expression,
+
+546
+00:27:32,629 --> 00:27:36,035
+and I call that here
+even rec expression1.
+
+547
+00:27:36,035 --> 00:27:37,670
+It doesn't look as beautiful
+
+548
+00:27:37,670 --> 00:27:39,140
+as what we write down on paper.
+
+549
+00:27:39,140 --> 00:27:41,240
+We will actually look
+at this later on
+
+550
+00:27:41,240 --> 00:27:43,640
+if this can be improved.
+But here it is.
+
+551
+00:27:43,640 --> 00:27:45,724
+Here's the reg expression,
+
+552
+00:27:45,724 --> 00:27:49,520
+and here's a little function
+which measures the time.
+
+553
+00:27:49,520 --> 00:27:53,180
+And we can now start testing
+
+554
+00:27:53,180 --> 00:27:57,845
+this reg expression with
+strings of just containing A's.
+
+555
+00:27:57,845 --> 00:28:00,020
+And we first a bit cautious,
+
+556
+00:28:00,020 --> 00:28:04,985
+be tested between 020
+and be count by two.
+
+557
+00:28:04,985 --> 00:28:08,330
+And I actually will not
+start that with Scala,
+
+558
+00:28:08,330 --> 00:28:12,965
+but actually the orbital online.
+
+559
+00:28:12,965 --> 00:28:15,305
+And that's out.
+
+560
+00:28:15,305 --> 00:28:17,269
+And that correlates.
+
+561
+00:28:17,269 --> 00:28:20,675
+So for 18 days it's pretty fast.
+
+562
+00:28:20,675 --> 00:28:24,815
+But for 20 it already
+needs to seconds.
+
+563
+00:28:24,815 --> 00:28:28,265
+And you can see
+actually this jump from
+
+564
+00:28:28,265 --> 00:28:32,825
+18 to 20 in the runtime
+is prepared to.
+
+565
+00:28:32,825 --> 00:28:37,460
+And if we actually measure
+it more accurately,
+
+566
+00:28:37,460 --> 00:28:39,770
+then we get this picture.
+
+567
+00:28:39,770 --> 00:28:41,255
+And what turns out,
+
+568
+00:28:41,255 --> 00:28:45,830
+we actually inverse as Python
+and Ruby in this example.
+
+569
+00:28:45,830 --> 00:28:49,230
+So I guess that's a fail.