videos/01-basics2.srt
changeset 766 e8402d8ec8e6
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/videos/01-basics2.srt	Tue Sep 29 12:52:07 2020 +0100
@@ -0,0 +1,1627 @@
+1
+00:00:05,810 --> 00:00:11,039
+Okay, so we have strings.
+The empty string and string "abc".
+
+2
+00:00:11,039 --> 00:00:13,200
+And we can take the head
+
+3
+00:00:13,200 --> 00:00:15,615
+of a string and the
+tail of the string,
+
+4
+00:00:15,615 --> 00:00:18,480
+since we regard them as
+lists of characters.
+
+5
+00:00:18,480 --> 00:00:22,230
+We also introduced the
+notion of a language.
+
+6
+00:00:22,230 --> 00:00:25,305
+A language is just
+a set of strings.
+
+7
+00:00:25,305 --> 00:00:26,700
+So here's a language.
+
+8
+00:00:26,700 --> 00:00:28,275
+It contains the empty string,
+
+9
+00:00:28,275 --> 00:00:30,360
+the string hello, string foobar,
+
+10
+00:00:30,360 --> 00:00:31,965
+the string, which is
+
+11
+00:00:31,965 --> 00:00:34,874
+just a character a and
+the string abc.
+
+12
+00:00:34,874 --> 00:00:37,800
+There will be also other
+languages, which for example,
+
+13
+00:00:37,800 --> 00:00:39,780
+contain infinitely many strings.
+
+14
+00:00:39,780 --> 00:00:42,190
+Actually that will
+happen quite often.
+
+15
+00:00:42,190 --> 00:00:45,560
+Now, you've seen this operation
+
+16
+00:00:45,560 --> 00:00:47,660
+of concatenation of two strings.
+
+17
+00:00:47,660 --> 00:00:50,840
+So if you have the string
+foo and a string bar,
+
+18
+00:00:50,840 --> 00:00:53,255
+we can just put them
+next to each other.
+
+19
+00:00:53,255 --> 00:00:57,725
+We will now extend that
+notion also told languages.
+
+20
+00:00:57,725 --> 00:01:02,270
+So assume you have a
+language A and a language B.
+
+21
+00:01:02,270 --> 00:01:05,825
+That means we take every
+string from the language
+
+22
+00:01:05,825 --> 00:01:11,195
+A and concatenate it with every
+string of the language B.
+
+23
+00:01:11,195 --> 00:01:13,160
+Here's an example.
+
+24
+00:01:13,160 --> 00:01:14,750
+If you have the language A
+
+25
+00:01:14,750 --> 00:01:17,030
+containing foo and bar as strings,
+
+26
+00:01:17,030 --> 00:01:20,585
+and the language B containing
+a and b as strings.
+
+27
+00:01:20,585 --> 00:01:23,000
+Then concatenating
+these two languages
+
+28
+00:01:23,000 --> 00:01:27,110
+means I take foo with this a and b,
+
+29
+00:01:27,110 --> 00:01:28,625
+giving fooa and foob,
+
+30
+00:01:28,625 --> 00:01:30,905
+and I take bar and
+concatenated with
+
+31
+00:01:30,905 --> 00:01:34,160
+a and b, giving bara and barb.
+
+32
+00:01:34,160 --> 00:01:36,185
+So the pointer is we're
+
+33
+00:01:36,185 --> 00:01:39,110
+overloading this notion
+of concatenating
+
+34
+00:01:39,110 --> 00:01:45,020
+two strings also to
+concatenating two languages.
+
+35
+00:01:45,020 --> 00:01:48,890
+Okay, here are two corner
+cases of this definition.
+
+36
+00:01:48,890 --> 00:01:51,395
+What happens if I
+have any language,
+
+37
+00:01:51,395 --> 00:01:54,470
+say A, and I concatenate it with
+
+38
+00:01:54,470 --> 00:01:58,640
+the language which contains
+only the empty string.
+
+39
+00:01:58,640 --> 00:02:02,270
+And secondly, if I
+have any language,
+
+40
+00:02:02,270 --> 00:02:04,220
+say again A, and I
+
+41
+00:02:04,220 --> 00:02:05,870
+concatenate it with the language
+
+42
+00:02:05,870 --> 00:02:08,345
+which doesn't contain any string.
+
+43
+00:02:08,345 --> 00:02:11,720
+How would these two
+cases be defined?
+
+44
+00:02:11,720 --> 00:02:16,055
+Remember, this language here
+contains a single element,
+
+45
+00:02:16,055 --> 00:02:17,645
+namely the empty string,
+
+46
+00:02:17,645 --> 00:02:20,210
+while this language 
+
+47
+00:02:20,210 --> 00:02:23,585
+does not contain any
+string. It is empty.
+
+48
+00:02:23,585 --> 00:02:26,930
+So these are two
+different languages.
+
+49
+00:02:26,930 --> 00:02:29,960
+Think about how this
+would be defined.
+
+50
+00:02:29,960 --> 00:02:33,455
+So what's the point of
+all these definitions?
+
+51
+00:02:33,455 --> 00:02:36,110
+Well, we want to make precise
+
+52
+00:02:36,110 --> 00:02:38,735
+what is the meaning of
+a regular expression.
+
+53
+00:02:38,735 --> 00:02:41,510
+For this, we use
+this function L. It
+
+54
+00:02:41,510 --> 00:02:44,180
+will take a regular expression as
+
+55
+00:02:44,180 --> 00:02:47,150
+argument and generates a
+
+56
+00:02:47,150 --> 00:02:49,970
+set of strings or a language.
+
+57
+00:02:49,970 --> 00:02:52,670
+And it's supposed to be the
+language which contains
+
+58
+00:02:52,670 --> 00:02:56,330
+all the strings this regular
+expression can match.
+
+59
+00:02:56,330 --> 00:02:58,670
+So remember the 0
+regular expression,
+
+60
+00:02:58,670 --> 00:03:01,115
+it's not supposed to
+match any string.
+
+61
+00:03:01,115 --> 00:03:05,105
+So we give it the empty
+language or empty set.
+
+62
+00:03:05,105 --> 00:03:06,740
+This regular expression,
+
+63
+00:03:06,740 --> 00:03:09,380
+the one regular expression
+is supposed to match
+
+64
+00:03:09,380 --> 00:03:13,445
+exactly one string,
+namely the empty string.
+
+65
+00:03:13,445 --> 00:03:15,710
+the regular expression
+
+66
+00:03:15,710 --> 00:03:18,875
+recognizing just the
+character c, in this case.
+
+67
+00:03:18,875 --> 00:03:22,820
+Well, that will be the
+language which contains
+
+68
+00:03:22,820 --> 00:03:28,175
+the string only containing
+the single character c. 
+
+69
+00:03:28,175 --> 00:03:31,295
+Now, what is this regular
+expression supposed to match?
+
+70
+00:03:31,295 --> 00:03:34,835
+Well, a string is matched
+by this regular expression
+
+71
+00:03:34,835 --> 00:03:37,355
+if it is matched by
+the first component, r1,
+
+72
+00:03:37,355 --> 00:03:40,970
+or by the second component, r2.
+
+73
+00:03:40,970 --> 00:03:43,190
+That's why this plus
+is the alternative.
+
+74
+00:03:43,190 --> 00:03:45,605
+So which are the strings
+
+75
+00:03:45,605 --> 00:03:48,320
+this regular expression
+can match? Well,
+
+76
+00:03:48,320 --> 00:03:51,275
+all the strings
+r1 one can match,
+
+77
+00:03:51,275 --> 00:03:54,170
+given by L(r1) union...
+
+78
+00:03:54,170 --> 00:03:57,380
+all the strings
+R2 can match.
+
+79
+00:03:57,380 --> 00:04:01,325
+And we use here the union
+because this alternative,
+
+80
+00:04:01,325 --> 00:04:05,945
+even if it says this string
+is matched by r1 or by r2.
+
+81
+00:04:05,945 --> 00:04:08,465
+The meaning of this
+reg expression
+
+82
+00:04:08,465 --> 00:04:11,209
+is the union of both languages.
+
+83
+00:04:11,209 --> 00:04:14,315
+Now the sequence case.
+
+84
+00:04:14,315 --> 00:04:17,960
+This means a string is
+matched by this regular expression
+
+85
+00:04:17,960 --> 00:04:20,510
+if the first part of this
+string is matched by r1
+
+86
+00:04:20,510 --> 00:04:24,935
+and the second part of
+the string is matched by r2.
+
+87
+00:04:24,935 --> 00:04:28,490
+Now we have to say, which
+are all the strings
+
+88
+00:04:28,490 --> 00:04:31,645
+this regular expression
+can match. Well, it's
+
+89
+00:04:31,645 --> 00:04:34,495
+all the strings
+r1 can match,
+
+90
+00:04:34,495 --> 00:04:39,205
+concatenated with all the
+strings r2 can match.
+
+91
+00:04:39,205 --> 00:04:42,895
+So for this, we have to use
+this concatenation operation.
+
+92
+00:04:42,895 --> 00:04:47,440
+So if r1 can match a string
+and r2 can match a string,
+
+93
+00:04:47,440 --> 00:04:51,220
+then in the meaning will be
+the concatenation of that.
+
+94
+00:04:51,220 --> 00:04:53,170
+So with this function L
+
+95
+00:04:53,170 --> 00:04:56,995
+we can make precise
+what are the strings,
+
+96
+00:04:56,995 --> 00:05:00,205
+*all* the strings a regular
+expression can match.
+
+97
+00:05:00,205 --> 00:05:04,689
+The only case we haven't
+specified yet is the r*.
+
+98
+00:05:04,689 --> 00:05:07,750
+For this, we need
+another operation
+
+99
+00:05:07,750 --> 00:05:12,470
+on languages or sets of
+strings, which we do next.
+
+100
+00:05:12,490 --> 00:05:17,285
+We introduce a power
+operation for languages.
+
+101
+00:05:17,285 --> 00:05:19,100
+The power operation or
+
+102
+00:05:19,100 --> 00:05:22,835
+the nth power is
+defined recursively.
+
+103
+00:05:22,835 --> 00:05:26,615
+So the A to the power of 0
+
+104
+00:05:26,615 --> 00:05:31,205
+would be defined as the set
+containing the empty string.
+
+105
+00:05:31,205 --> 00:05:36,200
+And A to the power of
+n + 1 would be A
+
+106
+00:05:36,200 --> 00:05:38,705
+concatenated with
+A to the power of
+
+107
+00:05:38,705 --> 00:05:42,160
+n. Here are a few examples.
+
+108
+00:05:42,160 --> 00:05:45,380
+A to the power of four
+would be essentially
+
+109
+00:05:45,380 --> 00:05:47,660
+A concatenated with
+A concatenated with
+
+110
+00:05:47,660 --> 00:05:49,640
+A concatenated with A,
+
+111
+00:05:49,640 --> 00:05:51,650
+and also concatenated with
+
+112
+00:05:51,650 --> 00:05:55,580
+the language which
+contains the empty string.
+
+113
+00:05:55,580 --> 00:05:57,275
+Because that's how we define
+
+114
+00:05:57,275 --> 00:05:59,975
+the base case, A
+to the power of 0.
+
+115
+00:05:59,975 --> 00:06:03,410
+And A to the power
+one would be just A,
+
+116
+00:06:03,410 --> 00:06:05,330
+again followed by that one,
+
+117
+00:06:05,330 --> 00:06:07,790
+but this would be just A.
+
+118
+00:06:07,790 --> 00:06:10,385
+And A to the power 0
+
+119
+00:06:10,385 --> 00:06:14,270
+is this language which
+contains the empty string.
+
+120
+00:06:14,270 --> 00:06:16,670
+With this power operation,
+
+121
+00:06:16,670 --> 00:06:19,505
+I can also fill in this case.
+
+122
+00:06:19,505 --> 00:06:23,210
+Remember this r* operation
+is supposed to match
+
+123
+00:06:23,210 --> 00:06:27,680
+a string with either
+0 or more copies of r.
+
+124
+00:06:27,680 --> 00:06:31,940
+So the meaning of this
+regular expression would be
+
+125
+00:06:31,940 --> 00:06:37,475
+now the Union of all the
+powers greater or equal than 0,
+
+126
+00:06:37,475 --> 00:06:40,835
+of the language that r can match.
+
+127
+00:06:40,835 --> 00:06:44,945
+And we take now at the
+union of all these powers,
+
+128
+00:06:44,945 --> 00:06:47,150
+that means essentially what
+
+129
+00:06:47,150 --> 00:06:48,530
+can the regular expression match
+
+130
+00:06:48,530 --> 00:06:51,440
+if I take L of r
+to the 0th power,
+
+131
+00:06:51,440 --> 00:06:53,030
+what can I match with
+
+132
+00:06:53,030 --> 00:06:57,305
+the first power, the
+second power, and so on.
+
+133
+00:06:57,305 --> 00:07:00,544
+And I take the union
+of all these powers.
+
+134
+00:07:00,544 --> 00:07:03,830
+That's the definition
+of which strings
+
+135
+00:07:03,830 --> 00:07:05,510
+this regular expression
+is supposed to
+
+136
+00:07:05,510 --> 00:07:08,510
+match. The meaning of
+this regular expression.
+
+137
+00:07:08,510 --> 00:07:11,300
+This is such an
+important definition,
+
+138
+00:07:11,300 --> 00:07:13,250
+that there's actually
+a name for it.
+
+139
+00:07:13,250 --> 00:07:15,020
+It's called the Kleene star.
+
+140
+00:07:15,020 --> 00:07:16,400
+And it's written like this.
+
+141
+00:07:16,400 --> 00:07:18,335
+If I have a language A,
+
+142
+00:07:18,335 --> 00:07:21,785
+I can build the star
+of this language
+
+143
+00:07:21,785 --> 00:07:26,255
+by union up all the powers
+greater or equal than 0.
+
+144
+00:07:26,255 --> 00:07:28,580
+The A-star of
+
+145
+00:07:28,580 --> 00:07:31,580
+a language would expand
+to a to the power 0,
+
+146
+00:07:31,580 --> 00:07:33,770
+union A to the power of one,
+
+147
+00:07:33,770 --> 00:07:36,665
+A to the power of two, and so on.
+
+148
+00:07:36,665 --> 00:07:39,230
+And it would
+essentially mean, well,
+
+149
+00:07:39,230 --> 00:07:43,460
+it's the language which contains
+the empty string because
+
+150
+00:07:43,460 --> 00:07:48,290
+of the A to the 0, one copy of A,
+
+151
+00:07:48,290 --> 00:07:51,020
+two concatenated copies of A,
+
+152
+00:07:51,020 --> 00:07:55,070
+three concatenated
+copies of A, and so on.
+
+153
+00:07:55,070 --> 00:07:59,060
+So that's what this A-star
+is defined as.
+
+154
+00:07:59,060 --> 00:08:03,725
+Essentially as the Union
+of all the powers.
+
+155
+00:08:03,725 --> 00:08:05,990
+And it's essentially
+each power is how many
+
+156
+00:08:05,990 --> 00:08:08,750
+times I have to
+concatenate this language A.
+
+157
+00:08:08,750 --> 00:08:13,549
+And note that this language
+A-star will always contain
+
+158
+00:08:13,549 --> 00:08:21,240
+the empty string because that
+is how the A^0 is defined.
+
+159
+00:08:21,310 --> 00:08:23,540
+So in this definition,
+
+160
+00:08:23,540 --> 00:08:25,969
+I could have also written star
+
+161
+00:08:25,969 --> 00:08:29,180
+here because the
+meaning of this r*,
+
+162
+00:08:29,180 --> 00:08:34,055
+regular expression
+is the Kleene star of
+
+163
+00:08:34,055 --> 00:08:37,130
+the language of r.
+Remember that's
+
+164
+00:08:37,130 --> 00:08:41,525
+the union of all powers
+greater or equal than 0.
+
+165
+00:08:41,525 --> 00:08:43,640
+It's behind.
+I hope you don't get
+
+166
+00:08:43,640 --> 00:08:45,800
+confused by the use of the stars.
+
+167
+00:08:45,800 --> 00:08:48,845
+This star is for
+the regular expressions.
+
+168
+00:08:48,845 --> 00:08:52,205
+That star is for languages.
+
+169
+00:08:52,205 --> 00:08:54,274
+They are two
+different operations.
+
+170
+00:08:54,274 --> 00:08:56,555
+They don't even
+have the same type.
+
+171
+00:08:56,555 --> 00:08:58,954
+Here might be
+something interesting.
+
+172
+00:08:58,954 --> 00:09:00,560
+Remember I asked you to think
+
+173
+00:09:00,560 --> 00:09:03,035
+about these two corner cases.
+
+174
+00:09:03,035 --> 00:09:04,730
+I hope you have done so,
+
+175
+00:09:04,730 --> 00:09:07,070
+but let's also have look
+at this together.
+
+176
+00:09:07,070 --> 00:09:09,785
+According to the definition
+
+177
+00:09:09,785 --> 00:09:11,930
+for this append of languages,
+
+178
+00:09:11,930 --> 00:09:13,670
+I have to take every string in
+
+179
+00:09:13,670 --> 00:09:16,130
+this set and have
+
+180
+00:09:16,130 --> 00:09:18,695
+two concatenated with
+every string in that set.
+
+181
+00:09:18,695 --> 00:09:22,115
+So this set contains only
+the empty string as element.
+
+182
+00:09:22,115 --> 00:09:24,820
+So if I concatenate anything in
+
+183
+00:09:24,820 --> 00:09:27,745
+there with the empty string,
+that will be left untouched.
+
+184
+00:09:27,745 --> 00:09:31,855
+So this one will be
+actually A.
+
+185
+00:09:31,855 --> 00:09:34,600
+This one I have to
+take every string in
+
+186
+00:09:34,600 --> 00:09:36,190
+this language and I have to
+
+187
+00:09:36,190 --> 00:09:39,115
+concatenate with every
+string in that language.
+
+188
+00:09:39,115 --> 00:09:41,110
+But here isn't any string.
+
+189
+00:09:41,110 --> 00:09:43,300
+So I can't concatenate that.
+
+190
+00:09:43,300 --> 00:09:46,900
+That will be actually
+the empty set (not empty string).
+
+191
+00:09:46,900 --> 00:09:49,570
+So now let's have
+
+192
+00:09:49,570 --> 00:09:51,670
+a look at regular expressions.
+
+193
+00:09:51,670 --> 00:09:53,230
+That was with languages.
+
+194
+00:09:53,230 --> 00:09:56,035
+But let's have a look at
+regular expressions now.
+
+195
+00:09:56,035 --> 00:09:58,660
+What would be the
+meaning, for example,
+
+196
+00:09:58,660 --> 00:10:01,945
+of r followed by 1?
+
+197
+00:10:01,945 --> 00:10:04,149
+That is a regular expression.
+
+198
+00:10:04,149 --> 00:10:06,255
+And it essentially says:
+
+199
+00:10:06,255 --> 00:10:07,850
+this regular expression matches
+
+200
+00:10:07,850 --> 00:10:10,685
+some strings and this matches
+the empty string.
+
+201
+00:10:10,685 --> 00:10:13,730
+So if you find out
+what the meaning is,
+
+202
+00:10:13,730 --> 00:10:16,955
+we would apply this
+L-function to it.
+
+203
+00:10:16,955 --> 00:10:19,430
+And if we now look
+up the definition,
+
+204
+00:10:19,430 --> 00:10:27,360
+that would be L of r
+appended L of 1.
+
+205
+00:10:27,850 --> 00:10:32,255
+If you look up what
+this is defined,
+
+206
+00:10:32,255 --> 00:10:41,640
+then this would be L of r
+appended with this language.
+
+207
+00:10:41,950 --> 00:10:46,325
+And if you now look up
+our definition earlier,
+
+208
+00:10:46,325 --> 00:10:50,455
+that will be just L of r.
+
+209
+00:10:50,455 --> 00:10:52,690
+What that essentially
+
+210
+00:10:52,690 --> 00:10:55,765
+means is if you have
+this regular expression,
+
+211
+00:10:55,765 --> 00:10:58,885
+this will match
+exactly those strings
+
+212
+00:10:58,885 --> 00:11:01,000
+which this regular
+expression can match.
+
+213
+00:11:01,000 --> 00:11:04,794
+And that pretty much
+coincides with our intuition.
+
+214
+00:11:04,794 --> 00:11:05,950
+This is supposed to match
+
+215
+00:11:05,950 --> 00:11:08,410
+the empty string at
+the end of the string,
+
+216
+00:11:08,410 --> 00:11:11,005
+so it doesn't really
+make any difference.
+
+217
+00:11:11,005 --> 00:11:13,480
+And the meaning
+really tells us that.
+
+218
+00:11:13,480 --> 00:11:15,880
+But here's the
+interesting thing.
+
+219
+00:11:15,880 --> 00:11:17,979
+When you were in school,
+
+220
+00:11:17,979 --> 00:11:23,124
+how would you think
+about this expression?
+
+221
+00:11:23,124 --> 00:11:29,515
+Well, r times 1
+equals just our, okay?
+
+222
+00:11:29,515 --> 00:11:33,634
+Furthermore, if you had r times 0.
+
+223
+00:11:33,634 --> 00:11:35,045
+What would that be equal?
+
+224
+00:11:35,045 --> 00:11:37,205
+Well, it would be 0.
+
+225
+00:11:37,205 --> 00:11:39,650
+Ok, let's have
+
+226
+00:11:39,650 --> 00:11:42,605
+a look how that works out
+with regular expressions.
+
+227
+00:11:42,605 --> 00:11:46,355
+So if you take r followed by 0,
+
+228
+00:11:46,355 --> 00:11:48,260
+remember this is
+the regular expression
+
+229
+00:11:48,260 --> 00:11:49,895
+that cannot match anything.
+
+230
+00:11:49,895 --> 00:11:52,415
+By unfolding the definition,
+
+231
+00:11:52,415 --> 00:11:59,104
+I would get L of r
+appended with L of 0.
+
+232
+00:11:59,104 --> 00:12:01,775
+This is of course defined as L of r
+
+233
+00:12:01,775 --> 00:12:05,915
+appended with the empty set.
+
+234
+00:12:05,915 --> 00:12:10,760
+And this would, according
+to the definition earlier,
+
+235
+00:12:10,760 --> 00:12:13,830
+well just the empty set.
+
+236
+00:12:14,160 --> 00:12:20,260
+And this would be
+of course L of 0.
+
+237
+00:12:20,260 --> 00:12:24,580
+So it seems like these laws,
+
+238
+00:12:24,580 --> 00:12:27,175
+at least for the times,
+
+239
+00:12:27,175 --> 00:12:29,785
+seem to be valid from math,
+
+240
+00:12:29,785 --> 00:12:31,420
+are also valid with regular expressions,
+
+241
+00:12:31,420 --> 00:12:33,370
+if we look at their meaning.
+
+242
+00:12:33,370 --> 00:12:36,670
+These laws on natural numbers
+
+243
+00:12:36,670 --> 00:12:40,105
+and regular expressions and
+their close correspondance
+
+244
+00:12:40,105 --> 00:12:42,460
+probably explain why I use 
+
+245
+00:12:42,460 --> 00:12:46,975
+a times for the sequence
+regular expression.
+
+246
+00:12:46,975 --> 00:12:51,040
+So r followed by the
+regular expression 1,
+
+247
+00:12:51,040 --> 00:12:54,505
+will have the same meaning
+as that regular expression.
+
+248
+00:12:54,505 --> 00:12:59,120
+And r followed by the
+0 regular expression
+
+249
+00:12:59,120 --> 00:13:01,295
+will have this meaning.
+
+250
+00:13:01,295 --> 00:13:03,590
+You might now think, but I also
+
+251
+00:13:03,590 --> 00:13:06,605
+wrote the alternative with a plus.
+
+252
+00:13:06,605 --> 00:13:10,370
+Does a similar law
+holds for plus?
+
+253
+00:13:10,370 --> 00:13:15,965
+Of course, if I take r
+plus 0 on natural numbers,
+
+254
+00:13:15,965 --> 00:13:20,060
+that would be just r. Does
+hold for regular expressions?
+
+255
+00:13:20,060 --> 00:13:22,085
+Yes, indeed it holds.
+
+256
+00:13:22,085 --> 00:13:26,735
+If I have this
+regular expression and
+
+257
+00:13:26,735 --> 00:13:33,245
+unfold the definition that
+would be L(r) union L(0).
+
+258
+00:13:33,245 --> 00:13:38,060
+This would be equal
+to L(r) union...
+
+259
+00:13:38,060 --> 00:13:41,150
+What is this? Well, that's
+just the empty set.
+
+260
+00:13:41,150 --> 00:13:43,760
+And if I union something
+with the empty set,
+
+261
+00:13:43,760 --> 00:13:47,180
+then this will be
+just of L(r). So yes,
+
+262
+00:13:47,180 --> 00:13:50,120
+indeed, plus is also very much
+
+263
+00:13:50,120 --> 00:13:54,184
+inspired by the laws
+on natural numbers.
+
+264
+00:13:54,184 --> 00:13:57,065
+There exists other notations too,
+
+265
+00:13:57,065 --> 00:14:01,310
+but I prefer this
+using the plus for
+
+266
+00:14:01,310 --> 00:14:03,844
+the alternatives and the times
+
+267
+00:14:03,844 --> 00:14:05,765
+for the sequence expression.
+
+268
+00:14:05,765 --> 00:14:07,460
+It's also the reason why I call
+
+269
+00:14:07,460 --> 00:14:10,325
+this regular expression for
+the empty string 1.
+
+270
+00:14:10,325 --> 00:14:14,915
+And for matching
+nothing at all 0.
+
+271
+00:14:14,915 --> 00:14:18,665
+This correspondence to
+natural numbers doesn't quite
+
+272
+00:14:18,665 --> 00:14:22,055
+extend to the star
+regular expression.
+
+273
+00:14:22,055 --> 00:14:25,055
+But there's still a
+connection. There too.
+
+274
+00:14:25,055 --> 00:14:26,510
+Can you remember how
+
+275
+00:14:26,510 --> 00:14:29,345
+the power operation on
+languages was defined?
+
+276
+00:14:29,345 --> 00:14:31,370
+The 0 case was defined
+
+277
+00:14:31,370 --> 00:14:34,520
+as the set containing
+the empty string.
+
+278
+00:14:34,520 --> 00:14:37,039
+Why is that? It looks
+a bit arbitrary.
+
+279
+00:14:37,039 --> 00:14:41,150
+Why is it not the empty set
+or why is it not defined as A?
+
+280
+00:14:41,150 --> 00:14:43,745
+Why is defined in this
+particular way?
+
+281
+00:14:43,745 --> 00:14:46,880
+Well, can you remember how
+
+282
+00:14:46,880 --> 00:14:48,950
+the power operation r to the
+
+283
+00:14:48,950 --> 00:14:51,440
+0 is defined on natural numbers?
+
+284
+00:14:51,440 --> 00:14:53,930
+Yes, that's defined as 1.
+
+285
+00:14:53,930 --> 00:14:57,125
+Does that also apply to
+regular expressions?
+
+286
+00:14:57,125 --> 00:15:00,725
+Well, if we take the meaning
+of a regular expression,
+
+287
+00:15:00,725 --> 00:15:04,730
+let's say r, and raise
+it to the 0th power,
+
+288
+00:15:04,730 --> 00:15:07,685
+then this will be, of
+course, by definition,
+
+289
+00:15:07,685 --> 00:15:12,245
+be defined as the set 
+containing the empty string.
+
+290
+00:15:12,245 --> 00:15:17,160
+And that is of course
+equal to L(1).
+
+291
+00:15:17,830 --> 00:15:20,570
+Again, you can see that
+
+292
+00:15:20,570 --> 00:15:23,300
+this law on natural numbers also
+
+293
+00:15:23,300 --> 00:15:29,000
+holds on the laws of regular
+expressions - on heir meaning.
+
+294
+00:15:29,000 --> 00:15:32,810
+What I find really beautiful
+here is that of course,
+
+295
+00:15:32,810 --> 00:15:36,244
+the meaning is defined on
+sets, sets of strings.
+
+296
+00:15:36,244 --> 00:15:38,975
+This of course, on natural numbers.
+
+297
+00:15:38,975 --> 00:15:41,060
+You can probably
+see now where I'm
+
+298
+00:15:41,060 --> 00:15:43,085
+coming from with my notation.
+
+299
+00:15:43,085 --> 00:15:46,205
+That is actually a slight
+problem in this field,
+
+300
+00:15:46,205 --> 00:15:48,590
+since it's so old.
+
+301
+00:15:48,590 --> 00:15:52,205
+Pretty much everybody
+introduced their own notation.
+
+302
+00:15:52,205 --> 00:15:53,870
+And you now have heaps of
+
+303
+00:15:53,870 --> 00:15:55,550
+different notations
+for the same thing.
+
+304
+00:15:55,550 --> 00:15:57,379
+Okay, that is slightly
+exaggerating.
+
+305
+00:15:57,379 --> 00:16:00,830
+And also the notation
+I used for times and
+
+306
+00:16:00,830 --> 00:16:04,295
+plus and 0 and 1s...definitely
+I'm not the only one.
+
+307
+00:16:04,295 --> 00:16:05,435
+There are many people
+
+308
+00:16:05,435 --> 00:16:07,625
+who have used this
+notation as well.
+
+309
+00:16:07,625 --> 00:16:10,190
+It's just, it's not universal.
+
+310
+00:16:10,190 --> 00:16:12,290
+Well, here's a question now.
+
+311
+00:16:12,290 --> 00:16:15,635
+Why did we go through this
+kerfuffle in the first place?
+
+312
+00:16:15,635 --> 00:16:19,370
+Why did we introduce these
+operations on languages?
+
+313
+00:16:19,370 --> 00:16:21,260
+And why did we introduce
+
+314
+00:16:21,260 --> 00:16:23,255
+the meaning of a
+regular expression?
+
+315
+00:16:23,255 --> 00:16:26,300
+Well, remember at the beginning,
+
+316
+00:16:26,300 --> 00:16:28,265
+we wanted to specify
+
+317
+00:16:28,265 --> 00:16:31,880
+what problem our algorithms
+actually supposed to solve.
+
+318
+00:16:31,880 --> 00:16:35,960
+And we want to make that
+very precise so that we can
+
+319
+00:16:35,960 --> 00:16:38,555
+be sure that our implementation
+
+320
+00:16:38,555 --> 00:16:40,295
+really solves the problem.
+
+321
+00:16:40,295 --> 00:16:41,540
+And that's what you can do now.
+
+322
+00:16:41,540 --> 00:16:45,605
+We can say a regular
+expression, r say,
+
+323
+00:16:45,605 --> 00:16:50,015
+is matching a string
+s if and only if
+
+324
+00:16:50,015 --> 00:16:55,474
+the string s is in the language
+of the regular expression.
+
+325
+00:16:55,474 --> 00:17:00,770
+So that is the problem our
+matcher is supposed to solve.
+
+326
+00:17:00,770 --> 00:17:03,320
+And it's supposed to
+solve that a bit faster
+
+327
+00:17:03,320 --> 00:17:06,860
+than in Python, Ruby and Java.
+
+328
+00:17:06,860 --> 00:17:09,585
+And unfortunately we cannot use
+
+329
+00:17:09,585 --> 00:17:12,260
+the definition of L
+directly for that.
+
+330
+00:17:12,260 --> 00:17:15,815
+Because remember, in
+the case of the star,
+
+331
+00:17:15,815 --> 00:17:17,690
+sometimes the meaning of
+
+332
+00:17:17,690 --> 00:17:18,830
+a regular expression is actually
+
+333
+00:17:18,830 --> 00:17:20,925
+an infinite set of strings.
+
+334
+00:17:20,925 --> 00:17:23,575
+And it's a tiny bit difficult
+
+335
+00:17:23,575 --> 00:17:27,040
+to decide whether a string
+is an infinite set.
+
+336
+00:17:27,040 --> 00:17:30,790
+So we have to do something
+more clever in our algorithm.
+
+337
+00:17:30,790 --> 00:17:33,535
+But that's for next week.
+So see you next week. Bye.
+
+338
+00:17:38,535 --> 00:17:41,680
+Okay, just to troll you.
+Here's one more slide.
+
+339
+00:17:41,680 --> 00:17:45,850
+So when you go over the
+handouts and the videos,
+
+340
+00:17:45,850 --> 00:17:47,875
+think about this example.
+
+341
+00:17:47,875 --> 00:17:49,840
+Imagine you have a language A,
+
+342
+00:17:49,840 --> 00:17:51,970
+which contains four strings.
+
+343
+00:17:51,970 --> 00:17:53,725
+First string is just the character a,
+
+344
+00:17:53,725 --> 00:17:57,445
+second string is just
+the character b, and so on.
+
+345
+00:17:57,445 --> 00:18:02,335
+How many strings are there
+in A to the power four.
+
+346
+00:18:02,335 --> 00:18:05,210
+Okay, that should be
+relatively simple.
+
+347
+00:18:05,210 --> 00:18:07,310
+If not, just try it out by
+
+348
+00:18:07,310 --> 00:18:11,165
+implementing it in Scala and see
+how many strings there are.
+
+349
+00:18:11,165 --> 00:18:13,850
+But now the more
+interesting question,
+
+350
+00:18:13,850 --> 00:18:16,670
+imagine the same language,
+
+351
+00:18:16,670 --> 00:18:19,400
+except that instead
+of the character d,
+
+352
+00:18:19,400 --> 00:18:22,250
+we have here, the empty string.
+
+353
+00:18:22,250 --> 00:18:26,630
+How many strings are in
+A to the power four?
+
+354
+00:18:26,630 --> 00:18:31,320
+Okay, I'll let you think
+about this. Bye now.