videos/01-evilregexes.srt
author Christian Urban <christian.urban@kcl.ac.uk>
Wed, 23 Sep 2020 11:34:43 +0100
changeset 761 fb07ac060866
child 765 b66602e0b42d
permissions -rw-r--r--
updated
1
00:00:06,240 --> 00:00:11,050
Welcome back. This video
is about regular expressions.

2
00:00:11,050 --> 00:00:14,230
We want to use regular
expressions in our lexer.

3
00:00:14,230 --> 00:00:16,165
And the purpose of the
lexer is to find

4
00:00:16,165 --> 00:00:18,070
out where the words in

5
00:00:18,070 --> 00:00:21,070
our programs are. However,
regular expressions

6
00:00:21,070 --> 00:00:23,875
are a fundamental tool
in computer science.

7
00:00:23,875 --> 00:00:27,910
And I'm sure you've used them
already on several occasions.

8
00:00:27,910 --> 00:00:30,370
And one would expect about

9
00:00:30,370 --> 00:00:31,750
regular expressions, since they are

10
00:00:31,750 --> 00:00:33,850
so well-known and well-studied,

11
00:00:33,850 --> 00:00:37,915
that everything under the
sun is known about them.

12
00:00:37,915 --> 00:00:41,080
But actually there's
still some surprising

13
00:00:41,080 --> 00:00:44,465
and interesting
problems with them.

14
00:00:44,465 --> 00:00:47,945
And I want to show you
them in this video.

15
00:00:47,945 --> 00:00:50,720
I'm sure you've seen
regular expressions

16
00:00:50,720 --> 00:00:52,445
many, many times before.

17
00:00:52,445 --> 00:00:55,100
But just to be on the same page,

18
00:00:55,100 --> 00:00:57,110
let me just recap them.

19
00:00:57,110 --> 00:00:59,210
So here in this line,

20
00:00:59,210 --> 00:01:01,790
there is a regular expression
which is supposed to

21
00:01:01,790 --> 00:01:05,285
recognize some form
of email addresses.

22
00:01:05,285 --> 00:01:07,745
So an e-mail address

23
00:01:07,745 --> 00:01:11,000
has a part which is
before the @ symbol,

24
00:01:11,000 --> 00:01:13,400
which is the name of the person.

25
00:01:13,400 --> 00:01:16,880
And that can be
any number between

26
00:01:16,880 --> 00:01:20,195
0 and 9, and letters between a and z.

27
00:01:20,195 --> 00:01:24,155
Let's say we are avoiding
capital letters here.

28
00:01:24,155 --> 00:01:26,045
There can be underscores.

29
00:01:26,045 --> 00:01:29,405
There can be a dot and
there can be hyphens.

30
00:01:29,405 --> 00:01:35,390
And after the @ symbol
comes the domain name.

31
00:01:35,390 --> 00:01:37,310
So as you can see here,

32
00:01:37,310 --> 00:01:40,640
we use things like star to

33
00:01:40,640 --> 00:01:44,314
match letters
zero or more times.

34
00:01:44,314 --> 00:01:45,985
Or we have a plus,

35
00:01:45,985 --> 00:01:47,420
which means you have to match

36
00:01:47,420 --> 00:01:52,489
one or more
times. Then we have the

37
00:01:52,489 --> 00:01:55,790
question mark, which says you

38
00:01:55,790 --> 00:01:59,105
match whether it is there
or it is not there.

39
00:01:59,105 --> 00:02:01,340
There are also regular
expressions which

40
00:02:01,340 --> 00:02:03,755
match exactly n times.

41
00:02:03,755 --> 00:02:08,720
Or this is a regular expression
for between n and m times.
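A minimal Python sketch of the quantifiers recapped so far: star, plus, question mark, exactly n times, and between n and m times. The patterns and test strings are illustrative assumptions; they are not taken from the slide shown in the video.

```python
import re

# Each entry: (pattern, strings that should match in full, strings that should not).
# All patterns and strings are made up for illustration.
QUANTIFIER_EXAMPLES = [
    (r"a*",     ["", "a", "aaa"],      ["b"]),           # star: zero or more times
    (r"a+",     ["a", "aaa"],          [""]),            # plus: one or more times
    (r"a?",     ["", "a"],             ["aa"]),          # question mark: there or not there
    (r"a{3}",   ["aaa"],               ["aa", "aaaa"]),  # exactly n times (here n = 3)
    (r"a{2,4}", ["aa", "aaa", "aaaa"], ["a", "aaaaa"]),  # between n and m times (2 and 4)
]


def check_quantifiers():
    """Return True if every example behaves as its comment claims."""
    for pattern, good, bad in QUANTIFIER_EXAMPLES:
        if not all(re.fullmatch(pattern, s) for s in good):
            return False
        if any(re.fullmatch(pattern, s) for s in bad):
            return False
    return True


print(check_quantifiers())
```

`re.fullmatch` is used so that the whole string has to be covered by the pattern, which makes the difference between the quantifiers visible.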

42
00:02:08,720 --> 00:02:12,065
You can see in
this email address,

43
00:02:12,065 --> 00:02:13,730
the top-level domain

44
00:02:13,730 --> 00:02:16,130
name can be any letter

45
00:02:16,130 --> 00:02:19,265
between a and z,
and contain dots,

46
00:02:19,265 --> 00:02:22,340
but can only be two
characters long

47
00:02:22,340 --> 00:02:25,685
up to six characters
and not more.
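One way the email pattern described above could be written down, as a Python sketch. The transcript does not reproduce the regular expression from the slide, so `EMAIL` is a reconstruction from the spoken description; in particular, the characters allowed in the domain name are an assumption, and the test addresses are made up.

```python
import re

# Reconstructed from the spoken description (an assumption, not the slide itself):
#   name part:        digits, lowercase letters, underscores, dots, hyphens; one or more
#   domain name:      digits, lowercase letters, dots, hyphens; one or more (assumed)
#   top-level domain: lowercase letters and dots; between 2 and 6 characters
EMAIL = re.compile(r"([a-z0-9_.-]+)@([a-z0-9.-]+)\.([a-z.]{2,6})")


def is_email(s):
    """True if the whole string s has the shape described above."""
    return EMAIL.fullmatch(s) is not None


print(is_email("jane_doe-1@example.org"))      # lowercase name, short top-level domain
print(is_email("Jane@example.org"))            # capital letter in the name part
print(is_email("someone@example.toolongtld"))  # top-level domain longer than six characters
```

Under these assumptions only the first address is accepted: the second contains a capital letter, and the third has a top-level domain of more than six characters.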

48
00:02:25,685 --> 00:02:29,240
Then you also have
something like ranges.

49
00:02:29,240 --> 00:02:31,220
So you can see, letters between a

50
00:02:31,220 --> 00:02:33,635
and z and 0 to 9 and so on.

51
00:02:33,635 --> 00:02:36,545
Here you also have regular
expressions which can

52
00:02:36,545 --> 00:02:40,070
match something which
isn't in this range.

53
00:02:40,070 --> 00:02:42,560
So for example, if
you want to match

54
00:02:42,560 --> 00:02:44,030
letters but not numbers,

55
00:02:44,030 --> 00:02:45,800
you would say, well, if

56
00:02:45,800 --> 00:02:48,990
this is a number, that
should not match.

57
00:02:49,090 --> 00:02:52,804
Typically you also
have these ranges.

58
00:02:52,804 --> 00:02:55,565
Lowercase letters,
capital letters.

59
00:02:55,565 --> 00:02:58,550
Then you have some
special regular expressions

60
00:02:58,550 --> 00:03:02,195
like this one, which is only
supposed to match digits.

61
00:03:02,195 --> 00:03:05,674
A dot is supposed to
match any character.

62
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   283
00:03:05,674 --> 00:03:07,370
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   284
And then they have also something
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   285
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   286
63
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   287
00:03:07,370 --> 00:03:09,800
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   288
called groups which
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   289
is supposed to be
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   290
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   291
64
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   292
00:03:09,800 --> 00:03:12,799
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   293
used when you are
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   294
trying to extract
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   295
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   296
65
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   297
00:03:12,799 --> 00:03:15,605
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   298
a string you've matched.
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   299
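The three constructs just mentioned (digits, the dot, and groups) can be tried out with a small sketch in Python. The sample strings and the simplified address pattern are assumptions made up for illustration; they are not taken from the slides.

```python
import re

# \d matches a single digit; \d+ matches one or more digits.
digits_ok = re.fullmatch(r"\d+", "2020") is not None
digits_bad = re.fullmatch(r"\d+", "20x0") is not None

# A dot matches any single character (by default anything except a newline).
dot_ok = re.fullmatch(r"a.c", "a-c") is not None

# Groups, written with parentheses, let us extract the parts that matched.
# The pattern below is a deliberately simplified, hypothetical address shape.
m = re.fullmatch(r"(\w+)@(\w+)\.com", "alice@example.com")
user, host = m.group(1), m.group(2)

print(digits_ok, digits_bad, dot_ok, user, host)
```

The parentheses do not change which strings are matched here; they only record which part of the input each sub-expression consumed.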
66
00:03:15,605 --> 00:03:19,925
Okay, so these are the
typical regular expressions.

67
00:03:19,925 --> 00:03:23,075
And here's a particular one,

68
00:03:23,075 --> 00:03:25,820
trying to match something

69
00:03:25,820 --> 00:03:28,770
which resembles
an email address.

70
00:03:29,590 --> 00:03:33,065
Clearly, that should all be easy.

71
00:03:33,065 --> 00:03:36,230
And our technology should
be on top of that:

72
00:03:36,230 --> 00:03:37,865
we can take a

73
00:03:37,865 --> 00:03:41,015
regular expression and
we can take a string,

74
00:03:41,015 --> 00:03:43,340
and we should have programs to

75
00:03:43,340 --> 00:03:45,680
decide whether this
string is matched

76
00:03:45,680 --> 00:03:50,330
by the regular expression or
not. That should be easy-peasy, no?

77
00:03:50,330 --> 00:03:56,150
Well, let's have a
look at two examples.

78
00:03:56,150 --> 00:04:00,860
The first regular expression
is a star star b.

79
00:04:00,860 --> 00:04:02,990
And it is supposed
to match strings of

80
00:04:02,990 --> 00:04:05,825
the form 0 or more a's,

81
00:04:05,825 --> 00:04:10,385
followed by a b. The parentheses
you can ignore.

82
00:04:10,385 --> 00:04:11,990
And a star star

83
00:04:11,990 --> 00:04:14,120
also doesn't
make any difference

84
00:04:14,120 --> 00:04:16,505
to what kind of strings
can be matched.

85
00:04:16,505 --> 00:04:21,635
It can only match 0 or more
a's followed by a b.

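The claim that the double star makes no difference to the matched strings can be checked by brute force on short inputs. This is a minimal sketch, assuming the first regular expression is written `(a*)*b` in Python syntax; the helper name `agree_up_to` is made up for illustration. It compares the nested pattern with the plain `a*b` on every string over the letters a and b up to a small length.

```python
import re
from itertools import product

NESTED = re.compile(r"(a*)*b")  # "a star star b"
SIMPLE = re.compile(r"a*b")     # the same language without the extra star

def agree_up_to(max_len):
    """Return True if both patterns accept exactly the same strings
    over the alphabet {a, b} of length 0..max_len."""
    for length in range(max_len + 1):
        for chars in product("ab", repeat=length):
            s = "".join(chars)
            nested_hit = NESTED.fullmatch(s) is not None
            simple_hit = SIMPLE.fullmatch(s) is not None
            if nested_hit != simple_hit:
                return False
    return True

print(agree_up_to(8))
```

The length bound is kept small on purpose: the strings are short enough that even a backtracking matcher finishes quickly.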
86
00:04:21,635 --> 00:04:23,900
And the other regular expression

87
00:04:23,900 --> 00:04:26,990
is possibly a character a,

88
00:04:26,990 --> 00:04:32,930
n times, followed by character
a exactly n times.

89
00:04:32,930 --> 00:04:35,570
And we will try out

90
00:04:35,570 --> 00:04:38,360
these two regular expressions
with strings of the form a,

91
00:04:38,360 --> 00:04:39,890
aa, and so on,

92
00:04:39,890 --> 00:04:45,770
up to length n. And

93
00:04:45,770 --> 00:04:49,130
the first regular expression should
actually not match any of

94
00:04:49,130 --> 00:04:53,315
these strings because the
final b is missing.

95
00:04:53,315 --> 00:04:56,150
But that is
okay. For example,

96
00:04:56,150 --> 00:04:57,425
if you have a regular expression

97
00:04:57,425 --> 00:05:00,110
that is supposed to
check whether a string is

98
00:05:00,110 --> 00:05:01,490
an email address and the user

99
00:05:01,490 --> 00:05:03,380
gives some random
string in there,

100
00:05:03,380 --> 00:05:06,545
then this regular expression
should not match that string.

101
00:05:06,545 --> 00:05:08,420
And for the second regular expression

102
00:05:08,420 --> 00:05:11,195
you have to scratch your
head a little bit

103
00:05:11,195 --> 00:05:12,620
about what it can actually match.

104
00:05:12,620 --> 00:05:14,720
But after a little bit
of head scratching,

105
00:05:14,720 --> 00:05:18,260
you find out it can match
any string which consists of

106
00:05:18,260 --> 00:05:22,580
at least n a's and up
to 2n a's.

107
00:05:22,580 --> 00:05:24,290
So anything in this range,

108
00:05:24,290 --> 00:05:27,185
this regular expression
can actually match.

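The head scratching can be double-checked with a few lines of Python. This is a sketch under the assumption that the second regular expression is written `(a?){n}a{n}` in Python syntax; the function name `matched_lengths` is invented for this example. For a small n it lists the lengths of all-a strings that the pattern accepts.

```python
import re

def matched_lengths(n, max_len):
    """Lengths k in 0..max_len for which the string of k a's is
    fully matched by (a?){n}a{n}."""
    pattern = re.compile("(a?){%d}a{%d}" % (n, n))
    return [k for k in range(max_len + 1)
            if pattern.fullmatch("a" * k) is not None]

lengths = matched_lengths(3, 10)
print(lengths)
```

The optional part can contribute anywhere between 0 and n a's and the mandatory part contributes exactly n, which is why the accepted lengths run from n to 2n.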
109
00:05:27,185 --> 00:05:30,395
Okay, let's
take a random tool,

110
00:05:30,395 --> 00:05:32,630
for example Python.

111
00:05:32,630 --> 00:05:35,240
So here's a little
Python program.

112
00:05:35,240 --> 00:05:38,690
It uses the library
function of Python to

113
00:05:38,690 --> 00:05:42,935
match the regular expression
a star star b.

114
00:05:42,935 --> 00:05:46,805
And we measure the time with longer
and longer strings of a's.

115
00:05:46,805 --> 00:05:48,770
And conveniently we can give

116
00:05:48,770 --> 00:05:51,140
the number of a's here
on the command line.

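The program shown on screen is not part of this transcript. As a rough sketch of what such a script might look like, assuming the pattern is written `(a*)*b` and using Python's standard `re` and `time` modules, one could write the following. The function name `time_match` and the default of five a's are assumptions for illustration.

```python
import re
import sys
import time

EVIL = re.compile(r"(a*)*b")  # "a star star b"

def time_match(n):
    """Match the pattern against a string of n a's and
    return (string, match result, elapsed seconds)."""
    s = "a" * n
    start = time.perf_counter()
    result = EVIL.match(s)
    elapsed = time.perf_counter() - start
    return s, result, elapsed

if __name__ == "__main__":
    # The number of a's comes from the command line; fall back to 5.
    has_arg = len(sys.argv) > 1 and sys.argv[1].isdigit()
    n = int(sys.argv[1]) if has_arg else 5
    s, result, elapsed = time_match(n)
    print(s, result, "%.6f secs" % elapsed)
```

Saved under a hypothetical name such as `evil.py`, it would be called as `python3 evil.py 5`. Since none of the input strings contains a b, the printed result is always `None`; only the measured time changes with n. Be careful with large values of n, because a backtracking matcher can take very long on this pattern.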
117
00:05:51,140 --> 00:05:56,900
If I just call
this on the command line,

118
00:05:56,900 --> 00:05:59,900
let's say we first
start with five a's.

119
00:05:59,900 --> 00:06:03,920
And I also get the time, which
in this case is next to nothing.

120
00:06:03,920 --> 00:06:05,960
And here's the string
we just matched.

121
00:06:05,960 --> 00:06:07,640
And obviously the
regular expression

122
00:06:07,640 --> 00:06:09,110
did not match the string.

123
00:06:09,110 --> 00:06:11,255
That's indicated by this None.

124
00:06:11,255 --> 00:06:13,925
Let's take ten a's.

125
00:06:13,925 --> 00:06:16,490
It's also pretty quick.

126
00:06:16,490 --> 00:06:20,780
Fifteen a's, even quicker,

127
00:06:20,780 --> 00:06:23,180
but these times always need to

128
00:06:23,180 --> 00:06:25,820
be taken with a grain of salt.

129
00:06:25,820 --> 00:06:28,040
They are not 100
percent accurate.

130
00:06:28,040 --> 00:06:31,490
So 15 is also okay. Let's take

131
00:06:31,490 --> 00:06:36,965
20. That's already
double the time.

132
00:06:36,965 --> 00:06:42,440
Twenty-five, longer.

133
00:06:42,440 --> 00:06:45,680
Okay, that's suddenly,
from 0.2 seconds,

134
00:06:45,680 --> 00:06:48,960
it takes almost four seconds.

135
00:06:49,600 --> 00:06:54,890
Twenty-six, this

136
00:06:54,890 --> 00:07:01,415
takes six seconds
already. Double, okay?

137
00:07:01,415 --> 00:07:07,229
Go to 28. That would be now.

138
00:07:08,890 --> 00:07:11,840
You see the string
isn't very long,

139
00:07:11,840 --> 00:07:13,340
so that could be easily like

140
00:07:13,340 --> 00:07:16,070
just the size of
an email address.

141
00:07:16,070 --> 00:07:19,280
And the regular
expression matching

142
00:07:19,280 --> 00:07:22,550
engine in Python needs
quite a long time

143
00:07:22,550 --> 00:07:24,710
to find out that
this string of 28

144
00:07:24,710 --> 00:07:26,570
a's is actually not matched

145
00:07:26,570 --> 00:07:28,490
by that. You see, it's
still not finished.

146
00:07:28,490 --> 00:07:32,900
I think it should take
approximately like 20 seconds.

147
00:07:32,900 --> 00:07:34,400
Okay. Already 30.

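A minimal sketch of the kind of timing experiment described here. It assumes the regular expression is the "a star, star, followed by b" pattern, written `(a*)*b`, and that the input is a string of n a's. The helper name `time_match` and the use of `re.match` are illustrative assumptions, not taken from the video.

```python
import re
import time

# Assumed pattern: "a star, star, followed by b".
EVIL = re.compile(r"(a*)*b")

def time_match(n):
    """Try to match a string of n a's against EVIL.

    Returns the match result and the elapsed time in seconds.
    """
    s = "a" * n
    start = time.perf_counter()
    result = EVIL.match(s)  # the input contains no b, so this cannot succeed
    elapsed = time.perf_counter() - start
    return result, elapsed

if __name__ == "__main__":
    # Small sizes only: with a backtracking engine the time can grow very quickly.
    for n in (5, 10, 15, 20):
        result, elapsed = time_match(n)
        print(n, result, f"{elapsed:.4f}s")
```

As in the video, such timings should be taken with a grain of salt; they depend on the machine and on the Python version.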
148
00:07:34,400 --> 00:07:36,530
And if we would try

149
00:07:36,530 --> 00:07:40,805
30, that would be already
more than a minute.

150
00:07:40,805 --> 00:07:43,940
And if I would try
something like a hundred,

151
00:07:43,940 --> 00:07:46,220
you remember, with a doubling in

152
00:07:46,220 --> 00:07:48,770
each step or every second step,

153
00:07:48,770 --> 00:07:50,720
the story with the chess board,

154
00:07:50,720 --> 00:07:53,855
we probably would sit here
until the next century.

155
00:07:53,855 --> 00:07:56,820
So something is strange here.

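The chessboard remark can be made concrete with a rough back-of-the-envelope calculation. This sketch assumes about 30 seconds for 28 a's (the figure mentioned in the demonstration) and that the time doubles with every additional a. Both are approximations, and the function name `estimated_seconds` is made up for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def estimated_seconds(n, base_n=28, base_seconds=30.0):
    """Extrapolate the running time for n a's, assuming it doubles per extra a."""
    return base_seconds * 2 ** (n - base_n)

if __name__ == "__main__":
    for n in (28, 30, 40, 100):
        secs = estimated_seconds(n)
        years = secs / SECONDS_PER_YEAR
        print(n, f"{secs:.3g} seconds", f"{years:.3g} years")
```

Under these assumptions, 30 a's come out at about two minutes, in line with "more than a minute", and a hundred a's come out at far more than a century.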
156
00:07:57,580 --> 00:08:01,355
Okay, that might be just
a problem of Python.

157
00:08:01,355 --> 00:08:02,990
Let's have a look at another

158
00:08:02,990 --> 00:08:04,985
regular expression
matching engine.

159
00:08:04,985 --> 00:08:06,890
This time from JavaScript,

160
00:08:06,890 --> 00:08:10,040
also a pretty well-known
programming language.

161
00:08:10,040 --> 00:08:13,610
So here you can see
it's still a star,

162
00:08:13,610 --> 00:08:16,235
star followed by b,

163
00:08:16,235 --> 00:08:18,920
but the regular expression is
supposed to match that from

164
00:08:18,920 --> 00:08:21,830
the beginning of the
string up to the end.

165
00:08:21,830 --> 00:08:23,930
So there's not any difference

166
00:08:23,930 --> 00:08:26,150
in the strings this regular
expression matches.

167
00:08:26,150 --> 00:08:28,610
We'll just start at the
beginning of the string

168
00:08:28,610 --> 00:08:31,460
and finish at the
end of the string.

169
00:08:31,460 --> 00:08:35,285
And again, we just use
repeated a's for that.

170
00:08:35,285 --> 00:08:38,195
And similarly, we can

171
00:08:38,195 --> 00:08:41,930
call it on the command line
and can do some timing.

172
00:08:41,930 --> 00:08:44,540
So ten a's, good.

173
00:08:44,540 --> 00:08:46,340
Here's the string.

174
00:08:46,340 --> 00:08:48,320
It cannot match that string.

175
00:08:48,320 --> 00:08:50,525
And it's pretty fast.

176
00:08:50,525 --> 00:08:54,725
Twenty, also pretty fast.

177
00:08:54,725 --> 00:08:59,120
Twenty-five, again,

178
00:08:59,120 --> 00:09:06,650
somehow there is a kind of
threshold, that is 25, 26.

179
00:09:06,650 --> 00:09:09,485
Suddenly it takes much longer.

180
00:09:09,485 --> 00:09:14,360
And it has essentially the
same problem as with Python.

181
00:09:14,360 --> 00:09:17,165
So you see now, from 26 on,

182
00:09:17,165 --> 00:09:19,250
the times are always
doubling, from

183
00:09:19,250 --> 00:09:21,860
three seconds to seven seconds.

184
00:09:21,860 --> 00:09:23,330
So you can imagine what that

185
00:09:23,330 --> 00:09:24,890
roughly takes when I put in

186
00:09:24,890 --> 00:09:30,230
27. And you see the
string isn't very long.

187
00:09:30,230 --> 00:09:32,165
It's just twenty or more a's.

fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   854
188
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   855
00:09:32,165 --> 00:09:35,419
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   856
Imagine you have to
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   857
search a database
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   858
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   859
189
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   860
00:09:35,419 --> 00:09:38,720
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
   861
with kilobytes of data.

190
00:09:38,720 --> 00:09:42,260
With these regular
expressions, that would

191
00:09:42,260 --> 00:09:48,150
need years to go through with
these regular expressions.

192
00:09:48,630 --> 00:09:51,850
Okay, maybe the people in

193
00:09:51,850 --> 00:09:55,435
Python and JavaScript,
they're just idiots.

194
00:09:55,435 --> 00:09:58,180
Surely Java must do much better.

195
00:09:58,180 --> 00:10:01,045
So here's a program.

196
00:10:01,045 --> 00:10:03,415
You can see this again

197
00:10:03,415 --> 00:10:05,980
is the regular expression
and we are just having

198
00:10:05,980 --> 00:10:08,320
some scaffolding to generate

199
00:10:08,320 --> 00:10:11,905
strings from five up till 28.

200
00:10:11,905 --> 00:10:14,305
And if we run that,

201
00:10:14,305 --> 00:10:16,660
it actually does that automatically.

202
00:10:16,660 --> 00:10:19,900
So up till 19, pretty fast,

203
00:10:19,900 --> 00:10:24,925
but then starting from
23, it's getting pretty slow.

204
00:10:24,925 --> 00:10:27,445
So the question is
what's going on?

205
00:10:27,445 --> 00:10:29,230
By the way, I'm not quoting here

206
00:10:29,230 --> 00:10:33,755
Scala, as it is using internally
the regular expression

207
00:10:33,755 --> 00:10:36,665
matching engine from Java.

208
00:10:36,665 --> 00:10:39,065
So it would have exactly
the same problem.

209
00:10:39,065 --> 00:10:41,480
Also, I have been
here very careful,

210
00:10:41,480 --> 00:10:43,550
I'm using here Java 8,

211
00:10:43,550 --> 00:10:46,085
which nowadays is quite old.

212
00:10:46,085 --> 00:10:50,765
But you will see also
current Java versions.

213
00:10:50,765 --> 00:10:55,490
We will see we can out-compete
them by magnitudes.

214
00:10:55,490 --> 00:10:57,605
So I think I can stop that.

215
00:10:57,605 --> 00:10:59,165
Now, just finish here.

216
00:10:59,165 --> 00:11:04,025
You see the problem. Just
for completeness sake.

217
00:11:04,025 --> 00:11:07,010
Here is a Ruby program.

218
00:11:07,010 --> 00:11:09,935
This is using the other
regular expression.

219
00:11:09,935 --> 00:11:12,935
In this case the
string should match.

220
00:11:12,935 --> 00:11:20,300
And again it tries out
strings between 1 and 30 here.

221
00:11:20,300 --> 00:11:23,450
That's a program actually
a former student produced.

222
00:11:23,450 --> 00:11:25,565
And you can see for a's

223
00:11:25,565 --> 00:11:29,780
of length up till 20
a's it is pretty fast.

224
00:11:29,780 --> 00:11:32,495
But then starting at 26,

225
00:11:32,495 --> 00:11:35,285
it's getting really slow.

226
00:11:35,285 --> 00:11:37,100
So in this case,
remember the string

227
00:11:37,100 --> 00:11:38,870
is actually matched by
the regular expression.

228
00:11:38,870 --> 00:11:40,130
So it has nothing to do

229
00:11:40,130 --> 00:11:41,540
with whether a regular
expression actually

230
00:11:41,540 --> 00:11:45,485
matches a string or does
not match a string.

231
00:11:45,485 --> 00:11:48,260
I admit though these
regular expressions

232
00:11:48,260 --> 00:11:49,610
are carefully chosen,

233
00:11:49,610 --> 00:11:52,250
as you will see later on.

234
00:11:52,250 --> 00:11:55,620
Okay, I also just stop that here.

235
00:11:55,710 --> 00:12:00,985
Okay, this slide collects
this information about times.

236
00:12:00,985 --> 00:12:03,400
On the right hand side will

237
00:12:03,400 --> 00:12:05,860
be our regular expression matcher,

238
00:12:05,860 --> 00:12:08,290
which we implement next week.

239
00:12:08,290 --> 00:12:10,795
On the left-hand side,
are these times by

240
00:12:10,795 --> 00:12:14,260
various other regular
expression matching engines.

241
00:12:14,260 --> 00:12:17,809
On the top is this
regular expression.

242
00:12:19,080 --> 00:12:23,335
Possibly a, n times, then a, n times.

243
00:12:23,335 --> 00:12:26,890
And on the lower
is a star, star, b.

244
00:12:26,890 --> 00:12:30,370
And the x-axis shows here

245
00:12:30,370 --> 00:12:35,335
the length of the
string. How many a's.

246
00:12:35,335 --> 00:12:38,925
And on the y-axis is the time

247
00:12:38,925 --> 00:12:41,660
they need to decide whether

248
00:12:41,660 --> 00:12:44,615
the string is matched by
the regular expression or not.
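
A minimal sketch in Python of the kind of measurement behind this plot, assuming the two patterns read out above are (a?){n}a{n} and (a*)*b and that the input is a string of n a's. The helper names time_match and slide_pattern are made up for illustration and do not come from the lecture's own scripts. The lengths are kept small on purpose, because the running time of a backtracking matcher grows very quickly here.

```python
import re
import time


def time_match(pattern, text):
    """Return (matched, seconds) for a full match of pattern against text."""
    start = time.perf_counter()
    matched = re.fullmatch(pattern, text) is not None
    return matched, time.perf_counter() - start


def slide_pattern(n):
    """Build the first pattern, (a?){n}a{n}, for a given n (hypothetical helper)."""
    return "(a?){%d}a{%d}" % (n, n)


# x-axis: number of a's; y-axis: seconds needed to decide the match.
for n in range(5, 16):
    text = "a" * n
    matched1, secs1 = time_match(slide_pattern(n), text)
    matched2, secs2 = time_match("(a*)*b", text)
    print(f"n={n:2d}  (a?){{n}}a{{n}}: {matched1} {secs1:.5f}s"
          f"  (a*)*b: {matched2} {secs2:.5f}s")
```

Raising the upper bound of the loop towards 25 or 30 should make the slowdown described in the video visible, though the exact timings depend on the machine and the Python version.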

249
00:12:44,615 --> 00:12:46,415
So you can see here, Python,

250
00:12:46,415 --> 00:12:47,945
Java 8 and JavaScript,

251
00:12:47,945 --> 00:12:52,250
they max out approximately
at between 25 and 30.

252
00:12:52,250 --> 00:12:53,900
Because then it takes already

253
00:12:53,900 --> 00:12:55,160
a half a minute to

254
00:12:55,160 --> 00:12:57,410
decide whether the string
is matched or not.

255
00:12:57,410 --> 00:13:00,815
And similarly, in
the other example,

256
00:13:00,815 --> 00:13:03,830
Python and Ruby max out

257
00:13:03,830 --> 00:13:07,220
at a similar kind of
length of the strings.

258
00:13:07,220 --> 00:13:10,400
Because then they also use
half a minute to decide

259
00:13:10,400 --> 00:13:13,940
whether this regular expression
actually matches the string.
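
To try this kind of experiment yourself, here is a minimal Python sketch. It assumes the pattern `(a*)*b` as the test case, since this passage does not spell out the regular expressions behind the graphs, and `time_match` is a hypothetical helper name. It times Python's built-in backtracking matcher on strings of n a's. The running time can be expected to grow very quickly with n, so the sketch keeps n small.

```python
import re
import time

# Assumed example pattern: nested stars followed by a 'b'.
EVIL = re.compile(r"(a*)*b")


def time_match(n):
    """Return (matched, seconds) for matching EVIL against a string of n a's."""
    s = "a" * n
    start = time.perf_counter()
    matched = EVIL.match(s) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Small values of n only; larger ones may take a very long time.
    for n in range(10, 21, 2):
        matched, secs = time_match(n)
        print(f"n={n:2d}  matched={matched}  time={secs:.4f}s")
```

The string never contains a `b`, so the match has to fail; what varies is how long the engine needs to find that out.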

260
00:13:13,940 --> 00:13:16,790
Contrast that with
the regular expression

261
00:13:16,790 --> 00:13:19,235
with our regular
expression matcher,

262
00:13:19,235 --> 00:13:21,470
which we're going to implement.

263
00:13:21,470 --> 00:13:25,040
This can match
approximately 10 thousand

264
00:13:25,040 --> 00:13:30,065
a's in this example and
needs less than ten seconds.

265
00:13:30,065 --> 00:13:32,285
Actually, there will be
two versions of that.

266
00:13:32,285 --> 00:13:34,850
The first version may
also be relatively slow.

267
00:13:34,850 --> 00:13:36,410
But the second version,

268
00:13:36,410 --> 00:13:38,240
in contrast to Python and

269
00:13:38,240 --> 00:13:40,295
Ruby, will be blindingly fast.

270
00:13:40,295 --> 00:13:42,380
And in the second example,

271
00:13:42,380 --> 00:13:45,740
you have to be careful
about the x axis because

272
00:13:45,740 --> 00:13:49,385
that means four times
ten to the power six.

273
00:13:49,385 --> 00:13:51,695
It's actually 4 million a's.

274
00:13:51,695 --> 00:13:55,100
So our regular
expression matcher needs

275
00:13:55,100 --> 00:13:57,635
less than ten seconds to

276
00:13:57,635 --> 00:14:00,725
match a string
of 4 million a's.

277
00:14:00,725 --> 00:14:04,430
In contrast, Python, Java 8,

278
00:14:04,430 --> 00:14:06,770
and JavaScript need half a minute

279
00:14:06,770 --> 00:14:09,905
already for a string
of length just 30,

280
00:14:09,905 --> 00:14:12,365
unless you're very
careful with Java 8.

281
00:14:12,365 --> 00:14:15,725
Yes, Java 9 and above,

282
00:14:15,725 --> 00:14:17,180
they already have

283
00:14:17,180 --> 00:14:19,610
a much better regular
expression matching engine,

284
00:14:19,610 --> 00:14:22,805
but still we will be running
circles around them.
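
This passage does not say how the faster matcher works. One well-known technique that avoids backtracking is matching with Brzozowski derivatives, and the Python sketch below illustrates that idea. Treat it as an assumption for illustration, not as the lecture's own implementation; all class and function names (`Rexp`, `der`, `matches`, and so on) are made up for this example. The matcher consumes the string one character at a time and keeps a simplified regular expression that describes what may still follow.

```python
from dataclasses import dataclass


class Rexp:
    """Base class for regular expressions."""


@dataclass(frozen=True)
class Zero(Rexp):      # matches no string at all
    pass


@dataclass(frozen=True)
class One(Rexp):       # matches only the empty string
    pass


@dataclass(frozen=True)
class Char(Rexp):      # matches the single character c
    c: str


@dataclass(frozen=True)
class Alt(Rexp):       # r1 | r2
    r1: Rexp
    r2: Rexp


@dataclass(frozen=True)
class Seq(Rexp):       # r1 followed by r2
    r1: Rexp
    r2: Rexp


@dataclass(frozen=True)
class Star(Rexp):      # zero or more repetitions of r
    r: Rexp


def alt(r1, r2):
    """Build r1 | r2, dropping Zero branches and exact duplicates."""
    if isinstance(r1, Zero):
        return r2
    if isinstance(r2, Zero):
        return r1
    return r1 if r1 == r2 else Alt(r1, r2)


def seq(r1, r2):
    """Build r1 followed by r2, simplifying Zero and One away."""
    if isinstance(r1, Zero) or isinstance(r2, Zero):
        return Zero()
    if isinstance(r1, One):
        return r2
    if isinstance(r2, One):
        return r1
    return Seq(r1, r2)


def nullable(r):
    """Can r match the empty string?"""
    if isinstance(r, (One, Star)):
        return True
    if isinstance(r, Alt):
        return nullable(r.r1) or nullable(r.r2)
    if isinstance(r, Seq):
        return nullable(r.r1) and nullable(r.r2)
    return False       # Zero and Char


def der(c, r):
    """Derivative of r with respect to character c."""
    if isinstance(r, Char):
        return One() if r.c == c else Zero()
    if isinstance(r, Alt):
        return alt(der(c, r.r1), der(c, r.r2))
    if isinstance(r, Seq):
        left = seq(der(c, r.r1), r.r2)
        return alt(left, der(c, r.r2)) if nullable(r.r1) else left
    if isinstance(r, Star):
        return seq(der(c, r.r), r)
    return Zero()      # Zero and One


def matches(r, s):
    """Take one derivative per character, then test for the empty string."""
    for c in s:
        r = der(c, r)
    return nullable(r)


if __name__ == "__main__":
    evil = Seq(Star(Star(Char("a"))), Char("b"))   # (a*)*b
    print(matches(evil, "a" * 10000))              # string without a 'b'
    print(matches(evil, "a" * 10000 + "b"))        # string ending in 'b'
```

For `(a*)*b`, the simplified derivative stops changing after the first `a`, so each further character costs about the same amount of work. That is why a matcher of this kind is not thrown off by long strings of a's.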

285
00:14:22,805 --> 00:14:27,050
Given this data, I
call this slide:

286
00:14:27,050 --> 00:14:29,675
Why bother with
regular expressions?

287
00:14:29,675 --> 00:14:33,515
But you can probably
see these are

288
00:14:33,515 --> 00:14:34,910
at least poor times by

289
00:14:34,910 --> 00:14:38,015
the existing regular
expression matching engines.

290
00:14:38,015 --> 00:14:40,070
And it's actually
surprising that after

291
00:14:40,070 --> 00:14:42,695
one lecture we can already
do substantially better.

292
00:14:42,695 --> 00:14:47,495
And if you don't believe
in the times I gave here,

293
00:14:47,495 --> 00:14:50,090
please feel free to
play on your own

294
00:14:50,090 --> 00:14:52,865
with the examples
I uploaded on KEATS.

295
00:14:52,865 --> 00:14:55,235
These are exactly the programs

296
00:14:55,235 --> 00:14:57,470
I used here in the examples.

297
00:14:57,470 --> 00:14:59,255
So feel free.

298
00:14:59,255 --> 00:15:01,970
You might however now think, hmm.

299
00:15:01,970 --> 00:15:05,449
These are two very
well chosen examples.

300
00:15:05,449 --> 00:15:07,145
And I admit that's true.

301
00:15:07,145 --> 00:15:09,410
And such problems are never

302
00:15:09,410 --> 00:15:12,540
causing any problems
in real life.

303
00:15:13,300 --> 00:15:15,980
Regular expressions are used very

304
00:15:15,980 --> 00:15:19,415
frequently and they
do cause problems.

305
00:15:19,415 --> 00:15:21,410
So here's my first example from

306
00:15:21,410 --> 00:15:23,885
a company called Cloudflare.

307
00:15:23,885 --> 00:15:27,560
This is a huge hosting company

308
00:15:27,560 --> 00:15:30,935
which hosts very
well-known web pages.

309
00:15:30,935 --> 00:15:34,970
And they really try hard
to have no outage at all.

310
00:15:34,970 --> 00:15:37,340
And they managed
that for six years.

311
00:15:37,340 --> 00:15:39,320
But then a regular expression,

312
00:15:39,320 --> 00:15:41,180
actually this one caused
a problem and you

313
00:15:41,180 --> 00:15:43,265
can see there are also
like two stars.

314
00:15:43,265 --> 00:15:44,630
They are at the end.

315
00:15:44,630 --> 00:15:46,955
And because of that, a string needed

316
00:15:46,955 --> 00:15:49,865
too much time to be matched.
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1436
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1437
317
00:15:49,865 --> 00:15:50,990
And because of that,

318
00:15:50,990 --> 00:15:52,430
they had some outage for,

319
00:15:52,430 --> 00:15:54,125
I think several hours,

320
00:15:54,125 --> 00:15:57,920
actually in their malware
detection subsystem.

321
00:15:57,920 --> 00:16:02,060
And the second example
comes from 2016,

322
00:16:02,060 --> 00:16:04,040
where Stack Exchange,
I guess you know

323
00:16:04,040 --> 00:16:06,650
this webpage, had
also an outage for,

324
00:16:06,650 --> 00:16:08,390
I think at least an hour.

325
00:16:08,390 --> 00:16:13,070
Because a regular expression
they needed to format posts,

326
00:16:13,070 --> 00:16:15,575
needed too much time to

327
00:16:15,575 --> 00:16:19,010
recognize whether this post
should be accepted or not.

328
00:16:19,010 --> 00:16:23,390
And again, there was a
similar kind of problem.

329
00:16:23,390 --> 00:16:24,950
And you can read
the stories behind

330
00:16:24,950 --> 00:16:28,080
that on these two given links.

331
00:16:28,720 --> 00:16:31,730
When I looked at
this the first time,

332
00:16:31,730 --> 00:16:34,175
what surprised me is
that theoreticians,

333
00:16:34,175 --> 00:16:37,520
who sometimes dedicate their
life to regular expressions

334
00:16:37,520 --> 00:16:39,440
and know really a lot about

335
00:16:39,440 --> 00:16:41,690
them didn't know
anything about this.

336
00:16:41,690 --> 00:16:43,610
But engineers, they
already created

337
00:16:43,610 --> 00:16:46,160
a name for that:
regular expression

338
00:16:46,160 --> 00:16:47,975
denial of service attack.

339
00:16:47,975 --> 00:16:49,745
Because what you can,

340
00:16:49,745 --> 00:16:51,230
what can happen now is that

341
00:16:51,230 --> 00:16:54,920
attackers look for
certain strings

342
00:16:54,920 --> 00:16:56,780
that make your regular expression

343
00:16:56,780 --> 00:16:59,105
matching engine topple over.

344
00:16:59,105 --> 00:17:01,370
And these kinds of expressions,

345
00:17:01,370 --> 00:17:04,160
regular expressions, are called
evil regular expressions.

346
00:17:04,160 --> 00:17:06,350
And actually there are
quite a number of them.

347
00:17:06,350 --> 00:17:08,495
So you have seen this one,

348
00:17:08,495 --> 00:17:11,255
the first one, and the
second one already.

349
00:17:11,255 --> 00:17:13,400
But there are many, many more.

350
00:17:13,400 --> 00:17:15,620
And you can easily have in

351
00:17:15,620 --> 00:17:18,560
your program one of
these regular expressions.

352
00:17:18,560 --> 00:17:21,830
And then you have the
problem that if you do have

353
00:17:21,830 --> 00:17:23,240
this regular expression and

354
00:17:23,240 --> 00:17:25,640
somebody finds the
corresponding string,

355
00:17:25,640 --> 00:17:29,945
which makes the regex
matching engine topple over,

356
00:17:29,945 --> 00:17:31,820
then you have a problem

357
00:17:31,820 --> 00:17:34,295
because your webpage is
probably not available.

358
00:17:34,295 --> 00:17:36,140
This is also sometimes called

359
00:17:36,140 --> 00:17:39,350
this phenomenon,
catastrophic backtracking.

360
00:17:39,350 --> 00:17:43,595
In lecture three, we will
look at this more carefully.

361
00:17:43,595 --> 00:17:46,910
And actually why that
is such a problem in

362
00:17:46,910 --> 00:17:50,795
real life is actually
not to do with lexers.

363
00:17:50,795 --> 00:17:53,180
Yes, regular
expressions are used as

364
00:17:53,180 --> 00:17:55,040
the basic tool for implementing

365
00:17:55,040 --> 00:17:57,185
lexers. But regular expressions are,

366
00:17:57,185 --> 00:18:00,065
of course, used in
a much wider area.

367
00:18:00,065 --> 00:18:03,770
And they are especially used for
network intrusion detection.

368
00:18:03,770 --> 00:18:06,590
Imagine you have to

369
00:18:06,590 --> 00:18:10,130
administer a big network
and you only want to let

370
00:18:10,130 --> 00:18:13,640
in packets which you think are OK

371
00:18:13,640 --> 00:18:14,930
and you want to keep out

372
00:18:14,930 --> 00:18:17,645
any packet which might
hack into your network.

373
00:18:17,645 --> 00:18:22,670
So what they have is they
have suites of thousands and

374
00:18:22,670 --> 00:18:25,745
sometimes even more
regular expressions which

375
00:18:25,745 --> 00:18:27,755
check whether this packet

376
00:18:27,755 --> 00:18:30,065
satisfies some patterns or not.

377
00:18:30,065 --> 00:18:31,460
And in this case it will be left

378
00:18:31,460 --> 00:18:34,205
out or it will be let in.

379
00:18:34,205 --> 00:18:36,335
And with networks,

380
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1724
00:18:36,335 --> 00:18:39,080
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1725
the problem is that our
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1726
hardware is already
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1727
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1728
381
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1729
00:18:39,080 --> 00:18:43,190
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1730
so fast that the reg expressions
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1731
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1732
382
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1733
00:18:43,190 --> 00:18:45,169
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1734
really become a bottleneck.
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1735
fb07ac060866 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff changeset
  1736
383
00:18:45,169 --> 00:18:47,060
Because what do you do if

384
00:18:47,060 --> 00:18:49,880
suddenly a regular expression
takes too much time?

385
00:18:49,880 --> 00:18:52,670
Do you just stop the matching

386
00:18:52,670 --> 00:18:55,100
and let the packet
in regardless?

387
00:18:55,100 --> 00:18:58,190
Or do you just hold
the network up

388
00:18:58,190 --> 00:19:01,715
and not let anything in
until you have decided?

389
00:19:01,715 --> 00:19:04,895
So that's actually a
really hard problem.

390
00:19:04,895 --> 00:19:06,650
But the first time I came across

391
00:19:06,650 --> 00:19:09,965
that problem was actually
through this engineer.

392
00:19:09,965 --> 00:19:13,820
And it's always said that
Germans don't have any humour.

393
00:19:13,820 --> 00:19:16,985
But I found that
video quite funny.

394
00:19:16,985 --> 00:19:19,145
Maybe you have a
different opinion,

395
00:19:19,145 --> 00:19:21,095
but feel free to
have a look at it, as it

396
00:19:21,095 --> 00:19:23,705
explains exactly that problem.

397
00:19:23,705 --> 00:19:25,610
So in the next video,

398
00:19:25,610 --> 00:19:28,445
we will start to
implement this matcher.

399
00:19:28,445 --> 00:19:30,870
So I hope to see you there.