761
Christian Urban <christian.urban@kcl.ac.uk>
1
00:00:06,240 --> 00:00:11,050
Welcome back. This video
is about regular expressions.

2
00:00:11,050 --> 00:00:14,230
We want to use regular
expressions in our lexer.

3
00:00:14,230 --> 00:00:16,165
And the purpose of the
lexer is to find

4
00:00:16,165 --> 00:00:18,070
out where the words in

5
00:00:18,070 --> 00:00:21,070
our programs are. However,
regular expressions

6
00:00:21,070 --> 00:00:23,875
are a fundamental tool
in computer science.

7
00:00:23,875 --> 00:00:27,910
And I'm sure you've used them
already on several occasions.

8
00:00:27,910 --> 00:00:30,370
And one would expect about

9
00:00:30,370 --> 00:00:31,750
regular expressions, since they are

10
00:00:31,750 --> 00:00:33,850
so well known and well studied,

11
00:00:33,850 --> 00:00:37,915
that everything under the
sun is known about them.

12
00:00:37,915 --> 00:00:41,080
But actually there are
still some surprising

13
00:00:41,080 --> 00:00:44,465
and interesting
problems with them.

14
00:00:44,465 --> 00:00:47,945
And I want to show them
to you in this video.

15
00:00:47,945 --> 00:00:50,720
I'm sure you've seen
regular expressions

16
00:00:50,720 --> 00:00:52,445
many, many times before.

17
00:00:52,445 --> 00:00:55,100
But just to be on the same page,

18
00:00:55,100 --> 00:00:57,110
let me just recap them.

19
00:00:57,110 --> 00:00:59,210
So here in this line,

20
00:00:59,210 --> 00:01:01,790
there is a regular expression
which is supposed to

21
00:01:01,790 --> 00:01:05,285
recognize some form
of email addresses.

22
00:01:05,285 --> 00:01:07,745
So an e-mail address

23
00:01:07,745 --> 00:01:11,000
has a part which is
before the @ symbol,

24
00:01:11,000 --> 00:01:13,400
which is the name of the person.

25
00:01:13,400 --> 00:01:16,880
And that can be
any digit between

26
00:01:16,880 --> 00:01:20,195
0 and 9, and letters between a and z.

27
00:01:20,195 --> 00:01:24,155
Let's say we are avoiding
capital letters here.

28
00:01:24,155 --> 00:01:26,045
There can be underscores.

29
00:01:26,045 --> 00:01:29,405
There can be a dot and
there can be hyphens.

30
00:01:29,405 --> 00:01:35,390
And after the @ symbol
comes the domain name.

31
00:01:35,390 --> 00:01:37,310
So as you can see here,

32
00:01:37,310 --> 00:01:40,640
we use things like star to

33
00:01:40,640 --> 00:01:44,314
match letters
zero or more times.

34
00:01:44,314 --> 00:01:45,985
Or we have a plus,

35
00:01:45,985 --> 00:01:47,420
which means you have to match

36
00:01:47,420 --> 00:01:52,489
at least once, or more
times. Then we have a

37
00:01:52,489 --> 00:01:55,790
question mark, which says you

38
00:01:55,790 --> 00:01:59,105
match whether it is there
or it is not there.

39
00:01:59,105 --> 00:02:01,340
There are also regular
expressions which

40
00:02:01,340 --> 00:02:03,755
match exactly n times.

41
00:02:03,755 --> 00:02:08,720
Or this is a regular expression
for between n and m times.

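The quantifiers recapped above can be tried out with a small sketch. It assumes Python's standard `re` module, which is not what the video uses, and the helper name `matches` is made up for this illustration. Each pattern is checked against the whole string.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` is matched by `pattern`."""
    return re.fullmatch(pattern, text) is not None

# star: zero or more times
print(matches(r"a*", ""), matches(r"a*", "aaa"))
# plus: at least once
print(matches(r"a+", ""), matches(r"a+", "aaa"))
# question mark: either there or not there
print(matches(r"a?", ""), matches(r"a?", "aa"))
# exactly n times (here n = 3)
print(matches(r"a{3}", "aaa"), matches(r"a{3}", "aa"))
# between n and m times (here 2 to 4)
print(matches(r"a{2,4}", "aaa"), matches(r"a{2,4}", "aaaaa"))
```

Using `re.fullmatch` rather than `re.search` is a deliberate choice here: the question in the video is whether a string as a whole is matched, not whether it contains a match somewhere.
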
42
00:02:08,720 --> 00:02:12,065
You can see in
this email address,

43
00:02:12,065 --> 00:02:13,730
the top-level domain

44
00:02:13,730 --> 00:02:16,130
name can be any letter

45
00:02:16,130 --> 00:02:19,265
from a to z,
and contain dots,

46
00:02:19,265 --> 00:02:22,340
but can only be from two
characters long

47
00:02:22,340 --> 00:02:25,685
up to six characters
and not more.

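To make the description of the email pattern concrete, here is a minimal sketch in Python. The exact regular expression shown in the video is not part of this transcript, so the pattern below is only a reconstruction from the spoken description; in particular the character set for the domain name is an assumption, and the names `EMAIL` and `looks_like_email` are invented for this example.

```python
import re

# Assumed reconstruction of the pattern described in the video; the exact
# regular expression on the slide is not part of this transcript.
EMAIL = re.compile(
    r"[a-z0-9_.-]+"   # name: digits, lowercase letters, '_', '.', '-'
    r"@"              # the @ symbol
    r"[a-z0-9.-]+"    # domain name (this character set is an assumption)
    r"\."             # a literal dot before the top-level domain
    r"[a-z.]{2,6}"    # top-level domain: 2 to 6 lowercase letters or dots
)

def looks_like_email(text: str) -> bool:
    """Return True if the whole string is matched by the sketch pattern."""
    return EMAIL.fullmatch(text) is not None

print(looks_like_email("christian.urban@kcl.ac.uk"))
print(looks_like_email("Christian.Urban@kcl.ac.uk"))  # capital letters are avoided
print(looks_like_email("urban@kcl.abcdefgh"))         # top-level part longer than six
```

As the video says, this recognizes only some form of email addresses; it is not a validator for real-world addresses.
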
48
00:02:25,685 --> 00:02:29,240
Then you also have
something like ranges.

49
00:02:29,240 --> 00:02:31,220
So you can see, letters between a

50
00:02:31,220 --> 00:02:33,635
and z, and 0 to 9, and so on.

51
00:02:33,635 --> 00:02:36,545
Here you also have regular
expressions which can

52
00:02:36,545 --> 00:02:40,070
match something which
isn't in this range.

53
00:02:40,070 --> 00:02:42,560
So for example, if
you want to match

54
00:02:42,560 --> 00:02:44,030
letters but not numbers,

55
00:02:44,030 --> 00:02:45,800
you would say, well, if

56
00:02:45,800 --> 00:02:48,990
this is a number, that
should not match.

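Ranges and their negation can be sketched as follows, again assuming Python's `re` module rather than anything shown in the video. The names `LOWER`, `DIGIT`, `NOT_DIGIT` and `is_in` are hypothetical helpers for this illustration.

```python
import re

LOWER = re.compile(r"[a-z]")       # range: one lowercase letter
DIGIT = re.compile(r"[0-9]")       # range: one digit
NOT_DIGIT = re.compile(r"[^0-9]")  # negated range: one character that is not a digit

def is_in(char_class: "re.Pattern[str]", ch: str) -> bool:
    """Return True if the single character `ch` is matched by the class."""
    return char_class.fullmatch(ch) is not None

print(is_in(LOWER, "x"), is_in(LOWER, "7"))
print(is_in(DIGIT, "7"), is_in(DIGIT, "x"))
print(is_in(NOT_DIGIT, "x"), is_in(NOT_DIGIT, "7"))
```

Note that a negated range such as `[^0-9]` accepts any character that is not a digit, so it also accepts punctuation and spaces, not only letters.
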
57
00:02:49,090 --> 00:02:52,804
Typically you also
have these ranges.

58
00:02:52,804 --> 00:02:55,565
Lowercase letters,
capital letters.

59
00:02:55,565 --> 00:02:58,550
Then you have some
special regular expressions

60
00:02:58,550 --> 00:03:02,195
like this one, which is only
supposed to match digits.

61
00:03:02,195 --> 00:03:05,674
A dot is supposed to
match any character.

62
00:03:05,674 --> 00:03:07,370
And then there is also something

63
00:03:07,370 --> 00:03:09,800
called groups, which
are supposed to be

64
00:03:09,800 --> 00:03:12,799
used when you are
trying to extract

65
00:03:12,799 --> 00:03:15,605
a string you've matched.

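The digit shorthand, the dot and groups can be illustrated with a short sketch. It assumes Python's `re` module; the function names and the pattern inside `split_address` are made up for this example and are not taken from the video.

```python
import re
from typing import Optional, Tuple

def is_digits(text: str) -> bool:
    """\\d matches a digit; re.ASCII restricts it to 0-9 for str patterns."""
    return re.fullmatch(r"\d+", text, flags=re.ASCII) is not None

def is_single_char(text: str) -> bool:
    """The dot matches any character (in Python, except a newline by default)."""
    return re.fullmatch(r".", text) is not None

def split_address(text: str) -> Optional[Tuple[str, str]]:
    """Use two groups to extract the parts before and after the @ symbol."""
    m = re.fullmatch(r"([a-z0-9_.-]+)@([a-z0-9.-]+)", text)
    return (m.group(1), m.group(2)) if m else None

print(is_digits("2024"), is_digits("20x4"))
print(is_single_char("x"), is_single_char("\n"))
print(split_address("urban@kcl.ac.uk"))
```

The parentheses do not change which strings are matched; they only mark the parts of a matched string that can be extracted afterwards with `group(1)` and `group(2)`.
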
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
300 |
66
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
301 |
00:03:15,605 --> 00:03:19,925
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
302 |
Okay, so these are the
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
303 |
typical regular expressions.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
304 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
305 |
67
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
306 |
00:03:19,925 --> 00:03:23,075
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
307 |
And here's a particular one.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
308 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
309 |
68
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
310 |
00:03:23,075 --> 00:03:25,820
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
311 |
Trying to match something
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
312 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
313 |
69
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
314 |
00:03:25,820 --> 00:03:28,770
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
315 |
which resembles
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
316 |
an email address.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
317 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
318 |
70
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
319 |
00:03:29,590 --> 00:03:33,065
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
320 |
Clearly that should be all easy.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
321 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
322 |
71
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
323 |
00:03:33,065 --> 00:03:36,230
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
324 |
And our technology should
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
325 |
be on top of that.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
326 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
327 |
72
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
328 |
00:03:36,230 --> 00:03:37,865
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
329 |
That we can take a
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
330 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
331 |
73
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
332 |
00:03:37,865 --> 00:03:41,015
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
333 |
regular expressions and
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
334 |
we can take a string,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
335 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
336 |
74
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
337 |
00:03:41,015 --> 00:03:43,340
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
338 |
and we should have programs to
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
339 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
340 |
75
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
341 |
00:03:43,340 --> 00:03:45,680
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
342 |
decide whether this
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
343 |
string is matched
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
344 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
345 |
76
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
346 |
00:03:45,680 --> 00:03:50,330
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
347 |
by a regular expression or
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
348 |
not. And that should be easy-peasy, no?
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
349 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
350 |
77
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
351 |
00:03:50,330 --> 00:03:56,150
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
352 |
Well, let's have a
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
353 |
look at two examples.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
354 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
355 |
78
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
356 |
00:03:56,150 --> 00:04:00,860
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
357 |
The first regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
358 |
is a star star b.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
359 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
360 |
79
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
361 |
00:04:00,860 --> 00:04:02,990
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
362 |
And it is supposed
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
363 |
to match strings of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
364 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
365 |
80
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
366 |
00:04:02,990 --> 00:04:05,825
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
367 |
the form 0 or more a's,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
368 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
369 |
81
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
370 |
00:04:05,825 --> 00:04:10,385
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
371 |
followed by a b. The parentheses
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
372 |
you can ignore.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
373 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
374 |
82
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
375 |
00:04:10,385 --> 00:04:11,990
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
376 |
And a star star
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
377 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
378 |
83
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
379 |
00:04:11,990 --> 00:04:14,120
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
380 |
also doesn't
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
381 |
make any difference
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
382 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
383 |
84
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
384 |
00:04:14,120 --> 00:04:16,505
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
385 |
to what kind of strings
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
386 |
can be matched.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
387 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
388 |
85
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
389 |
00:04:16,505 --> 00:04:21,635
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
390 |
It can only match 0 or more
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
391 |
a's followed by a b.
|
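The claim that the extra star makes no difference can be checked on a few short strings. This is only a sketch: it assumes that "a star star b" on the slide corresponds to `(a*)*b` in Python's `re` syntax, and the names `nested`, `flat`, `agree` and `samples` are made up for the illustration.

```python
import re

# Assumption: "a star star b" is written (a*)*b in Python's re syntax.
nested = re.compile(r"(a*)*b")
# The same regular expression with the extra star dropped.
flat = re.compile(r"a*b")

def agree(s):
    """True if both patterns give the same yes/no answer on the whole string."""
    return (nested.fullmatch(s) is None) == (flat.fullmatch(s) is None)

# A few short strings, some of the form "0 or more a's followed by a b".
samples = ["b", "ab", "aaab", "", "a", "aaa", "ba", "abb"]

for s in samples:
    print(repr(s), nested.fullmatch(s) is not None, agree(s))
```

The strings are kept short on purpose, so the check finishes immediately.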
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
392 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
393 |
86
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
394 |
00:04:21,635 --> 00:04:23,900
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
395 |
And the other regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
396 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
397 |
87
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
398 |
00:04:23,900 --> 00:04:26,990
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
399 |
is possibly a character a,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
400 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
401 |
88
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
402 |
00:04:26,990 --> 00:04:32,930
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
403 |
n times, followed by character
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
404 |
a exactly n times.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
405 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
406 |
89
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
407 |
00:04:32,930 --> 00:04:35,570
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
408 |
And we will try out
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
409 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
410 |
90
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
411 |
00:04:35,570 --> 00:04:38,360
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
412 |
these two regular expressions
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
413 |
with strings of the form a,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
414 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
415 |
91
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
416 |
00:04:38,360 --> 00:04:39,890
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
417 |
aa, and so on,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
418 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
419 |
92
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
420 |
00:04:39,890 --> 00:04:45,770
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
421 |
up to the length of n. And
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
422 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
423 |
93
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
424 |
00:04:45,770 --> 00:04:49,130
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
425 |
the first regular expression should
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
426 |
actually not match any of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
427 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
428 |
94
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
429 |
00:04:49,130 --> 00:04:53,315
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
430 |
the strings because the
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
431 |
final b is missing.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
432 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
433 |
95
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
434 |
00:04:53,315 --> 00:04:56,150
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
435 |
But that is
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
436 |
okay. For example
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
437 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
438 |
96
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
439 |
00:04:56,150 --> 00:04:57,425
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
440 |
if you have a regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
441 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
442 |
97
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
443 |
00:04:57,425 --> 00:05:00,110
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
444 |
that is supposed to
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
445 |
check whether a string is
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
446 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
447 |
98
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
448 |
00:05:00,110 --> 00:05:01,490
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
449 |
an email address and the user
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
450 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
451 |
99
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
452 |
00:05:01,490 --> 00:05:03,380
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
453 |
gives some random
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
454 |
strings in there,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
455 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
456 |
100
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
457 |
00:05:03,380 --> 00:05:06,545
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
458 |
then this regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
459 |
should not match that string.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
460 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
461 |
101
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
462 |
00:05:06,545 --> 00:05:08,420
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
463 |
And for the second regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
464 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
465 |
102
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
466 |
00:05:08,420 --> 00:05:11,195
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
467 |
you have to scratch
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
468 |
your head a little bit,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
469 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
470 |
103
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
471 |
00:05:11,195 --> 00:05:12,620
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
472 |
what it can actually match.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
473 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
474 |
104
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
475 |
00:05:12,620 --> 00:05:14,720
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
476 |
But after a little bit
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
477 |
of head scratching,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
478 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
479 |
105
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
480 |
00:05:14,720 --> 00:05:18,260
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
481 |
you find out it can match
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
482 |
any string which is of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
483 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
484 |
106
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
485 |
00:05:18,260 --> 00:05:22,580
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
486 |
a length from n a's up
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
487 |
to 2n a's.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
488 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
489 |
107
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
490 |
00:05:22,580 --> 00:05:24,290
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
491 |
So anything in this range,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
492 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
493 |
108
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
494 |
00:05:24,290 --> 00:05:27,185
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
495 |
this regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
496 |
can actually match.
|
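The range from n to 2n a's can be confirmed for a small n. The sketch below assumes the second regular expression is `(a?){n}a{n}` in Python's `re` syntax (n optional a's followed by exactly n a's); the helper names `second_regex` and `matched_lengths` are hypothetical.

```python
import re

def second_regex(n):
    # Assumption: the regular expression on the slide is (a?){n}a{n},
    # that is, n optional a's followed by exactly n a's.
    return re.compile(r"(a?){%d}a{%d}" % (n, n))

def matched_lengths(n, upto):
    """Lengths k in 0..upto for which a string of k a's is matched in full."""
    r = second_regex(n)
    return [k for k in range(upto + 1) if r.fullmatch("a" * k)]

# For n = 3 only strings of 3, 4, 5 or 6 a's should be accepted.
print(matched_lengths(3, 10))
```

Only a small n is used here, since the point is the range of lengths and not the running time.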
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
497 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
498 |
109
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
499 |
00:05:27,185 --> 00:05:30,395
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
500 |
Okay, let's
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
501 |
take a random tool,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
502 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
503 |
110
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
504 |
00:05:30,395 --> 00:05:32,630
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
505 |
maybe for example Python.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
506 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
507 |
111
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
508 |
00:05:32,630 --> 00:05:35,240
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
509 |
So here's a little
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
510 |
Python program.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
511 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
512 |
112
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
513 |
00:05:35,240 --> 00:05:38,690
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
514 |
It uses the library
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
515 |
function of Python to
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
516 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
517 |
113
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
518 |
00:05:38,690 --> 00:05:42,935
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
519 |
match the regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
520 |
a star star b.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
521 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
522 |
114
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
523 |
00:05:42,935 --> 00:05:46,805
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
524 |
And we measure time with longer
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
525 |
and longer strings of a's.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
526 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
527 |
115
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
528 |
00:05:46,805 --> 00:05:48,770
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
529 |
And so conveniently we can give
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
530 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
531 |
116
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
532 |
00:05:48,770 --> 00:05:51,140
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
533 |
the number of a's here
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
534 |
on the command line.
|
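The program shown in the video is not reproduced in this transcript. A minimal sketch of such a timing script, assuming the regular expression is written `(a*)*b` and that the standard `re` and `time` modules are used, might look as follows; the function name `time_match` and the default of five a's are assumptions.

```python
import re
import sys
import time

def time_match(n):
    """Try (a*)*b on a string of n a's; return (match result, seconds taken)."""
    s = "a" * n
    start = time.perf_counter()
    # The string contains no b, so the expected result is None.
    result = re.match(r"(a*)*b", s)
    return result, time.perf_counter() - start

if __name__ == "__main__":
    # Number of a's from the command line; the default of 5 is an assumption.
    has_arg = len(sys.argv) > 1 and sys.argv[1].isdigit()
    n = int(sys.argv[1]) if has_arg else 5
    result, secs = time_match(n)
    print("a" * n, result, f"{secs:.4f}s")
```

Saved under a hypothetical name such as `catastrophic.py`, it could be called as `python3 catastrophic.py 5`. The measured times depend on the machine and the Python version.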
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
535 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
536 |
117
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
537 |
00:05:51,140 --> 00:05:56,900
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
538 |
If I just call
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
539 |
this on the command line,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
540 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
541 |
118
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
542 |
00:05:56,900 --> 00:05:59,900
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
543 |
Let's say we first
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
544 |
start with five a's.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
545 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
546 |
119
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
547 |
00:05:59,900 --> 00:06:03,920
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
548 |
And I also get the time, which
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
549 |
in this case is next to nothing.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
550 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
551 |
120
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
552 |
00:06:03,920 --> 00:06:05,960
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
553 |
And here's the string
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
554 |
we just matched.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
555 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
556 |
121
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
557 |
00:06:05,960 --> 00:06:07,640
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
558 |
And obviously the
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
559 |
regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
560 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
561 |
122
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
562 |
00:06:07,640 --> 00:06:09,110
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
563 |
did not match the string.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
564 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
565 |
123
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
566 |
00:06:09,110 --> 00:06:11,255
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
567 |
That's indicated by this None.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
568 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
569 |
124
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
570 |
00:06:11,255 --> 00:06:13,925
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
571 |
Let's take ten a's.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
572 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
573 |
125
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
574 |
00:06:13,925 --> 00:06:16,490
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
575 |
It's also pretty quick.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
576 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
577 |
126
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
578 |
00:06:16,490 --> 00:06:20,780
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
579 |
Fifteen a's, even quicker,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
580 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
581 |
127
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
582 |
00:06:20,780 --> 00:06:23,180
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
583 |
but these times always need to
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
584 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
585 |
128
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
586 |
00:06:23,180 --> 00:06:25,820
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
587 |
be taken with a grain of salt.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
588 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
589 |
129
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
590 |
00:06:25,820 --> 00:06:28,040
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
591 |
They are not 100
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
592 |
percent accurate.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
593 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
594 |
130
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
595 |
00:06:28,040 --> 00:06:31,490
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
596 |
So 15 is also quick. Let's take
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
597 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
598 |
131
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
599 |
00:06:31,490 --> 00:06:36,965
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
600 |
20. That's now already
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
601 |
double the time.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
602 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
603 |
132
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
604 |
00:06:36,965 --> 00:06:42,440
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
605 |
Twenty-five longer.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
606 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
607 |
133
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
608 |
00:06:42,440 --> 00:06:45,680
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
609 |
Okay, that suddenly
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
610 |
from 0.2 seconds,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
611 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
612 |
134
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
613 |
00:06:45,680 --> 00:06:48,960
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
614 |
it takes almost four seconds.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
615 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
616 |
135
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
617 |
00:06:49,600 --> 00:06:54,890
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
618 |
Twenty-six. This
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
619 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
620 |
136
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
621 |
00:06:54,890 --> 00:07:01,415
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
622 |
takes six seconds
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
623 |
already. Double, okay?
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
624 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
625 |
137
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
626 |
00:07:01,415 --> 00:07:07,229
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
627 |
Go to 28. That would be now.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
628 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
629 |
138
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
630 |
00:07:08,890 --> 00:07:11,840
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
631 |
You see the string
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
632 |
isn't very long,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
633 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
634 |
139
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
635 |
00:07:11,840 --> 00:07:13,340
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
636 |
so that could be easily like
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
637 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
638 |
140
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
639 |
00:07:13,340 --> 00:07:16,070
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
640 |
just the size of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
641 |
an email address.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
642 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
643 |
141
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
644 |
00:07:16,070 --> 00:07:19,280
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
645 |
And the regular
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
646 |
expression matching
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
647 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
648 |
142
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
649 |
00:07:19,280 --> 00:07:22,550
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
650 |
engine in Python needs
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
651 |
quite a long time
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
652 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
653 |
143
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
654 |
00:07:22,550 --> 00:07:24,710
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
655 |
to find out that
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
656 |
this string of 28
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
657 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
658 |
144
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
659 |
00:07:24,710 --> 00:07:26,570
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
660 |
a's is actually not matched
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
661 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
662 |
145
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
663 |
00:07:26,570 --> 00:07:28,490
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
664 |
by that. You see it's
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
665 |
still not finished.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
666 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
667 |
146
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
668 |
00:07:28,490 --> 00:07:32,900
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
669 |
I think it should take
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
670 |
approximately like 20 seconds.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
671 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
672 |
147
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
673 |
00:07:32,900 --> 00:07:34,400
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
674 |
Okay. Already 30 seconds.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
675 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
676 |
148
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
677 |
00:07:34,400 --> 00:07:36,530
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
678 |
And if we would try
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
679 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
680 |
149
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
681 |
00:07:36,530 --> 00:07:40,805
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
682 |
30, that would already be
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
683 |
more than a minute.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
684 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
685 |
150
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
686 |
00:07:40,805 --> 00:07:43,940
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
687 |
And if I would try
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
688 |
something like a hundred,

151
00:07:43,940 --> 00:07:46,220
you remember, with a doubling in

152
00:07:46,220 --> 00:07:48,770
each step, or every second step,

153
00:07:48,770 --> 00:07:50,720
the story with the chessboard,

154
00:07:50,720 --> 00:07:53,855
we would probably sit here
until the next century.

155
00:07:53,855 --> 00:07:56,820
So something strange is going on here.
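
To put a number on the chessboard remark, here is a rough Python sketch of the extrapolation. It assumes the running time doubles with every extra a, and it takes 30 seconds for 28 a's as a hypothetical starting point; neither the function name nor these base values come from the demo itself.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def estimated_seconds(n, base_n=28, base_seconds=30.0):
    """Extrapolate the running time, assuming it doubles with every extra a.

    base_n and base_seconds are illustrative assumptions, not measurements.
    """
    return base_seconds * 2 ** (n - base_n)

if __name__ == "__main__":
    for n in (28, 30, 40, 100):
        secs = estimated_seconds(n)
        years = secs / SECONDS_PER_YEAR
        print(f"{n:3d} a's: about {secs:.3g} seconds, or {years:.3g} years")
```

Under these assumptions, 30 a's already come out at two minutes, and a hundred a's at far more than a century.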

156
00:07:57,580 --> 00:08:01,355
Okay, that might just be
a problem with Python.

157
00:08:01,355 --> 00:08:02,990
Let's have a look at another

158
00:08:02,990 --> 00:08:04,985
regular expression
matching engine.

159
00:08:04,985 --> 00:08:06,890
This time from JavaScript,

160
00:08:06,890 --> 00:08:10,040
also a pretty well-known
programming language.

161
00:08:10,040 --> 00:08:13,610
So here you can see
it's still a star,

162
00:08:13,610 --> 00:08:16,235
star, followed by b,

163
00:08:16,235 --> 00:08:18,920
but the regular expression is
supposed to match from

164
00:08:18,920 --> 00:08:21,830
the beginning of the
string up to the end.

165
00:08:21,830 --> 00:08:23,930
So there's not any difference

166
00:08:23,930 --> 00:08:26,150
in the strings this regular
expression matches.

167
00:08:26,150 --> 00:08:28,610
We just start at the
beginning of the string

168
00:08:28,610 --> 00:08:31,460
and finish at the
end of the string.

169
00:08:31,460 --> 00:08:35,285
And again, we just use
repeated a's for that.

170
00:08:35,285 --> 00:08:38,195
And similarly, we can

171
00:08:38,195 --> 00:08:41,930
call it on the command line
and do some timing.
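
For readers who want to try the timing themselves, here is a minimal Python sketch of such an experiment. It assumes the pattern is the nested star followed by b, anchored at both ends, written here as `^(a*)*b$`; the helper name `time_match` and the range of lengths are made up for this illustration and are not the script shown in the video.

```python
import re
import time

# Assumed pattern: a star, star, followed by b, anchored at start and end.
PATTERN = re.compile(r"^(a*)*b$")

def time_match(n):
    """Match a string of n a's and return (matched, elapsed seconds)."""
    s = "a" * n
    start = time.perf_counter()
    matched = PATTERN.match(s) is not None
    elapsed = time.perf_counter() - start
    return matched, elapsed

if __name__ == "__main__":
    # Keep n small: the running time is expected to grow very quickly.
    for n in range(5, 21):
        matched, elapsed = time_match(n)
        print(f"{n:2d} a's: matched={matched}, time={elapsed:.4f}s")
```

None of the strings contains a b, so no match is expected; the point of interest is only how the time grows as n increases.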

172
00:08:41,930 --> 00:08:44,540
So ten a's, good.

173
00:08:44,540 --> 00:08:46,340
Here's the string.

174
00:08:46,340 --> 00:08:48,320
It cannot match that string.

175
00:08:48,320 --> 00:08:50,525
And it's pretty fast.

176
00:08:50,525 --> 00:08:54,725
Twenty. Also pretty fast.

177
00:08:54,725 --> 00:08:59,120
Twenty-five, again,

178
00:08:59,120 --> 00:09:06,650
somehow there is a kind of
threshold at 25, 26.

179
00:09:06,650 --> 00:09:09,485
Suddenly it takes much longer.

180
00:09:09,485 --> 00:09:14,360
And it has essentially the
same problem as with Python.

181
00:09:14,360 --> 00:09:17,165
So you see now, from 26 on,

182
00:09:17,165 --> 00:09:19,250
the times are always
doubling, from

183
00:09:19,250 --> 00:09:21,860
three seconds to seven seconds.

184
00:09:21,860 --> 00:09:23,330
So you can imagine what that

185
00:09:23,330 --> 00:09:24,890
roughly takes when I put in

186
00:09:24,890 --> 00:09:30,230
27. And you see the
string isn't very long.

187
00:09:30,230 --> 00:09:32,165
That's just 27 a's.

188
00:09:32,165 --> 00:09:35,419
Imagine you have to
search a database

189
00:09:35,419 --> 00:09:38,720
with kilobytes of data.

190
00:09:38,720 --> 00:09:42,260
With these regular
expressions, you would

191
00:09:42,260 --> 00:09:48,150
need years to go
through that data.

192
00:09:48,630 --> 00:09:51,850
Okay, maybe the people in

193
00:09:51,850 --> 00:09:55,435
Python and JavaScript,
they're just idiots.

194
00:09:55,435 --> 00:09:58,180
Surely Java must do much better.

195
00:09:58,180 --> 00:10:01,045
So here's a program.

196
00:10:01,045 --> 00:10:03,415
You can see, this again

197
00:10:03,415 --> 00:10:05,980
is the regular expression,
and we just have

198
00:10:05,980 --> 00:10:08,320
some scaffolding to generate

199
00:10:08,320 --> 00:10:11,905
strings from 5 up to 28.

200
00:10:11,905 --> 00:10:14,305
And if we run that,

201
00:10:14,305 --> 00:10:16,660
it actually does that automatically.

202
00:10:16,660 --> 00:10:19,900
So up till 19, pretty fast,

203
00:10:19,900 --> 00:10:24,925
but then starting from
23, it's getting pretty slow.

204
00:10:24,925 --> 00:10:27,445
So the question is
what's going on?

205
00:10:27,445 --> 00:10:29,230
By the way, I'm not showing Scala here.

206
00:10:29,230 --> 00:10:33,755
Scala internally uses
the regular expression

207
00:10:33,755 --> 00:10:36,665
matching engine from Java.

208
00:10:36,665 --> 00:10:39,065
So it would have exactly
the same problem.

209
00:10:39,065 --> 00:10:41,480
Also, I have been
very careful here:

210
00:10:41,480 --> 00:10:43,550
I'm using Java 8 here,

211
00:10:43,550 --> 00:10:46,085
which nowadays is quite old.

212
00:10:46,085 --> 00:10:50,765
But you will also see
current Java versions.

213
00:10:50,765 --> 00:10:55,490
We will see we can outcompete
them by orders of magnitude.

214
00:10:55,490 --> 00:10:57,605
So I think I can stop that.

215
00:10:57,605 --> 00:10:59,165
Now, just finish here.

216
00:10:59,165 --> 00:11:04,025
You see the problem. Just
for completeness' sake,

217
00:11:04,025 --> 00:11:07,010
here is a Ruby program.

218
00:11:07,010 --> 00:11:09,935
This is using the other
regular expression.

219
00:11:09,935 --> 00:11:12,935
In this case the
string should match.

220
00:11:12,935 --> 00:11:20,300
And again it tries out
strings between 1 and 30 here.

221
00:11:20,300 --> 00:11:23,450
That's a program actually
a former student produced.

222
00:11:23,450 --> 00:11:25,565
And you can see, for a's

223
00:11:25,565 --> 00:11:29,780
of length up to 20
a's, it's pretty fast.

224
00:11:29,780 --> 00:11:32,495
But then starting at 26,

225
00:11:32,495 --> 00:11:35,285
it's getting really slow.

226
00:11:35,285 --> 00:11:37,100
So in this case,
remember the string

227
00:11:37,100 --> 00:11:38,870
is actually matched by
the regular expression.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1033 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1034 |
228
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1035 |
00:11:38,870 --> 00:11:40,130
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1036 |
So it has nothing to do
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1037 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1038 |
229
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1039 |
00:11:40,130 --> 00:11:41,540
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1040 |
with whether a regular
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1041 |
expression actually
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1042 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1043 |
230
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1044 |
00:11:41,540 --> 00:11:45,485
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1045 |
matches a string or does
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1046 |
not match a string.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1047 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1048 |
231
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1049 |
00:11:45,485 --> 00:11:48,260
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1050 |
I admit though these
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1051 |
regular expressions
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1052 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1053 |
232
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1054 |
00:11:48,260 --> 00:11:49,610
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1055 |
are carefully chosen,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1056 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1057 |
233
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1058 |
00:11:49,610 --> 00:11:52,250
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1059 |
as you will see later on.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1060 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1061 |
234
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1062 |
00:11:52,250 --> 00:11:55,620
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1063 |
OK, I'll also just stop that here.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1064 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1065 |
235
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1066 |
00:11:55,710 --> 00:12:00,985
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1067 |
Okay, this slide collects
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1068 |
this information about times.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1069 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1070 |
236
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1071 |
00:12:00,985 --> 00:12:03,400
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1072 |
On the right hand side will
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1073 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1074 |
237
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1075 |
00:12:03,400 --> 00:12:05,860
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1076 |
be our regular expression matcher,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1077 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1078 |
238
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1079 |
00:12:05,860 --> 00:12:08,290
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1080 |
which we implement next week.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1081 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1082 |
239
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1083 |
00:12:08,290 --> 00:12:10,795
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1084 |
On the left-hand side,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1085 |
are the times by
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1086 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1087 |
240
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1088 |
00:12:10,795 --> 00:12:14,260
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1089 |
various other regular
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1090 |
expression matching engines.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1091 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1092 |
241
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1093 |
00:12:14,260 --> 00:12:17,809
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1094 |
On the top is this
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1095 |
regular expression.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1096 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1097 |
242
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1098 |
00:12:19,080 --> 00:12:23,335
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1099 |
Possibly a n times, a n times.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1100 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1101 |
243
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1102 |
00:12:23,335 --> 00:12:26,890
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1103 |
And on the lower one
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1104 |
is a star, star b.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1105 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1106 |
244
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1107 |
00:12:26,890 --> 00:12:30,370
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1108 |
And the x-axis shows here
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1109 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1110 |
245
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1111 |
00:12:30,370 --> 00:12:35,335
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1112 |
the length of the
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1113 |
string. How many a's.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1114 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1115 |
246
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1116 |
00:12:35,335 --> 00:12:38,925
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1117 |
And on the y axis is the time.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1118 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1119 |
247
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1120 |
00:12:38,925 --> 00:12:41,660
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1121 |
They need to decide whether
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1122 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1123 |
248
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1124 |
00:12:41,660 --> 00:12:44,615
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1125 |
the string is matched by
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1126 |
the regular expression or not.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1127 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1128 |
249
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1129 |
00:12:44,615 --> 00:12:46,415
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1130 |
So you can see here, Python,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1131 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1132 |
250
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1133 |
00:12:46,415 --> 00:12:47,945
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1134 |
Java 8 and JavaScript,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1135 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1136 |
251
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1137 |
00:12:47,945 --> 00:12:52,250
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1138 |
they max out approximately
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1139 |
at between 25 and 30.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1140 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1141 |
252
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1142 |
00:12:52,250 --> 00:12:53,900
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1143 |
Because then it takes already
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1144 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1145 |
253
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1146 |
00:12:53,900 --> 00:12:55,160
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1147 |
half a minute to
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1148 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1149 |
254
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1150 |
00:12:55,160 --> 00:12:57,410
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1151 |
decide whether the string
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1152 |
is matched or not.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1153 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1154 |
255
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1155 |
00:12:57,410 --> 00:13:00,815
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1156 |
And similarly, in
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1157 |
the other example,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1158 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1159 |
256
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1160 |
00:13:00,815 --> 00:13:03,830
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1161 |
Python and Ruby max out
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1162 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1163 |
257
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1164 |
00:13:03,830 --> 00:13:07,220
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1165 |
at a similar kind of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1166 |
length of the strings.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1167 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1168 |
258
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1169 |
00:13:07,220 --> 00:13:10,400
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1170 |
Because then they use also
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1171 |
half a minute to decide
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1172 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1173 |
259
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1174 |
00:13:10,400 --> 00:13:13,940
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1175 |
whether this regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1176 |
actually matches the string.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1177 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1178 |
260
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1179 |
00:13:13,940 --> 00:13:16,790
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1180 |
Contrast that with
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1181 |
the regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1182 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1183 |
261
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1184 |
00:13:16,790 --> 00:13:19,235
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1185 |
with our regular
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1186 |
expression matcher,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1187 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1188 |
262
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1189 |
00:13:19,235 --> 00:13:21,470
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1190 |
which we're going to implement.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1191 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1192 |
263
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1193 |
00:13:21,470 --> 00:13:25,040
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1194 |
This can match
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1195 |
approximately 10 thousand
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1196 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1197 |
264
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1198 |
00:13:25,040 --> 00:13:30,065
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1199 |
a's in this example and
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1200 |
needs less than ten seconds.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1201 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1202 |
265
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1203 |
00:13:30,065 --> 00:13:32,285
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1204 |
Actually, there will be
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1205 |
two versions of that.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1206 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1207 |
266
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1208 |
00:13:32,285 --> 00:13:34,850
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1209 |
First version may be
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1210 |
also relatively slow.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1211 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1212 |
267
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1213 |
00:13:34,850 --> 00:13:36,410
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1214 |
But the second version,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1215 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1216 |
268
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1217 |
00:13:36,410 --> 00:13:38,240
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1218 |
in contrast to Python,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1219 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1220 |
269
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1221 |
00:13:38,240 --> 00:13:40,295
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1222 |
Ruby, will be blindingly fast.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1223 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1224 |
270
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1225 |
00:13:40,295 --> 00:13:42,380
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1226 |
And in the second example,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1227 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1228 |
271
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1229 |
00:13:42,380 --> 00:13:45,740
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1230 |
you have to be careful
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1231 |
about the x axis because
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1232 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1233 |
272
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1234 |
00:13:45,740 --> 00:13:49,385
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1235 |
that means four times
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1236 |
ten to the power six.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1237 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1238 |
273
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1239 |
00:13:49,385 --> 00:13:51,695
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1240 |
It's actually 4 million a's.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1241 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1242 |
274
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1243 |
00:13:51,695 --> 00:13:55,100
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1244 |
So our regular
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1245 |
expression matcher needs
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1246 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1247 |
275
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1248 |
00:13:55,100 --> 00:13:57,635
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1249 |
less than ten seconds to
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1250 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1251 |
276
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1252 |
00:13:57,635 --> 00:14:00,725
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1253 |
match a string of length
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1254 |
of 4 million a's.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1255 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1256 |
277
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1257 |
00:14:00,725 --> 00:14:04,430
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1258 |
Contrast that: Python, Java 8,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1259 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1260 |
278
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1261 |
00:14:04,430 --> 00:14:06,770
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1262 |
and JavaScript need half a minute
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1263 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1264 |
279
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1265 |
00:14:06,770 --> 00:14:09,905
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1266 |
already for a string
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1267 |
of length just 30,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1268 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1269 |
280
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1270 |
00:14:09,905 --> 00:14:12,365
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1271 |
unless you're very
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1272 |
careful with Java 8.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1273 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1274 |
281
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1275 |
00:14:12,365 --> 00:14:15,725
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1276 |
Yes, Java 9 and above,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1277 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1278 |
282
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1279 |
00:14:15,725 --> 00:14:17,180
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1280 |
they already have
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1281 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1282 |
283
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1283 |
00:14:17,180 --> 00:14:19,610
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1284 |
a much better regular
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1285 |
expression matching engine,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1286 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1287 |
284
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1288 |
00:14:19,610 --> 00:14:22,805
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1289 |
but still we will be running
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1290 |
circles around them.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1291 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1292 |
285
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1293 |
00:14:22,805 --> 00:14:27,050
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1294 |
It's because of this data that I
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1295 |
call this slide:
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1296 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1297 |
286
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1298 |
00:14:27,050 --> 00:14:29,675
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1299 |
Why bother with
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1300 |
regular expressions?
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1301 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1302 |
287
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1303 |
00:14:29,675 --> 00:14:33,515
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1304 |
But you can probably
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1305 |
see these are
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1306 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1307 |
288
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1308 |
00:14:33,515 --> 00:14:34,910
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1309 |
at least more times by
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1310 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1311 |
289
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1312 |
00:14:34,910 --> 00:14:38,015
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1313 |
the existing regular
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1314 |
expression matching engines.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1315 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1316 |
290
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1317 |
00:14:38,015 --> 00:14:40,070
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1318 |
And it's actually
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1319 |
surprising that after
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1320 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1321 |
291
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1322 |
00:14:40,070 --> 00:14:42,695
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1323 |
one lecture we can already
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1324 |
do substantially better.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1325 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1326 |
292
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1327 |
00:14:42,695 --> 00:14:47,495
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1328 |
And if you don't believe
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1329 |
in the times I gave here,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1330 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1331 |
293
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1332 |
00:14:47,495 --> 00:14:50,090
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1333 |
please feel free to
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1334 |
play on your own
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1335 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1336 |
294
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1337 |
00:14:50,090 --> 00:14:52,865
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1338 |
with the examples
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1339 |
I uploaded on KEATS.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1340 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1341 |
295
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1342 |
00:14:52,865 --> 00:14:55,235
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1343 |
These are exactly the programs
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1344 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1345 |
296
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1346 |
00:14:55,235 --> 00:14:57,470
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1347 |
I used here in the examples.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1348 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1349 |
297
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1350 |
00:14:57,470 --> 00:14:59,255
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1351 |
So feel free.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1352 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1353 |
298
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1354 |
00:14:59,255 --> 00:15:01,970
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1355 |
You might however now think, hmm.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1356 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1357 |
299
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1358 |
00:15:01,970 --> 00:15:05,449
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1359 |
These are two very
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1360 |
well chosen examples.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1361 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1362 |
300
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1363 |
00:15:05,449 --> 00:15:07,145
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1364 |
And I admit that's true.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1365 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1366 |
301
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1367 |
00:15:07,145 --> 00:15:09,410
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1368 |
And such problems are never
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1369 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1370 |
302
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1371 |
00:15:09,410 --> 00:15:12,540
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1372 |
causing any problems
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1373 |
in real life.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1374 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1375 |
303
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1376 |
00:15:13,300 --> 00:15:15,980
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1377 |
Regular expressions are used very
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1378 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1379 |
304
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1380 |
00:15:15,980 --> 00:15:19,415
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1381 |
frequently and they
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1382 |
do cause problems.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1383 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1384 |
305
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1385 |
00:15:19,415 --> 00:15:21,410
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1386 |
So here's my first example from
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1387 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1388 |
306
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1389 |
00:15:21,410 --> 00:15:23,885
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1390 |
a company called Cloudflare.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1391 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1392 |
307
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1393 |
00:15:23,885 --> 00:15:27,560
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1394 |
This is a huge hosting company
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1395 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1396 |
308
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1397 |
00:15:27,560 --> 00:15:30,935
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1398 |
which hosts very
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1399 |
well-known web pages.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1400 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1401 |
309
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1402 |
00:15:30,935 --> 00:15:34,970
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1403 |
And they really try hard
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1404 |
to have no outage at all.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1405 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1406 |
310
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1407 |
00:15:34,970 --> 00:15:37,340
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1408 |
And they managed
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1409 |
that for six years.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1410 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1411 |
311
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1412 |
00:15:37,340 --> 00:15:39,320
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1413 |
But then a regular expression,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1414 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1415 |
312
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1416 |
00:15:39,320 --> 00:15:41,180
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1417 |
actually this one caused
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1418 |
a problem and you
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1419 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1420 |
313
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1421 |
00:15:41,180 --> 00:15:43,265
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1422 |
can see there are also
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1423 |
like two stars.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1424 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1425 |
314
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1426 |
00:15:43,265 --> 00:15:44,630
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1427 |
They are at the end.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1428 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1429 |
315
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1430 |
00:15:44,630 --> 00:15:46,955
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1431 |
And because of that, a string needed
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1432 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1433 |
316
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1434 |
00:15:46,955 --> 00:15:49,865
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1435 |
too much time to be matched.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1436 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1437 |
317
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1438 |
00:15:49,865 --> 00:15:50,990
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1439 |
And because of that,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1440 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1441 |
318
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1442 |
00:15:50,990 --> 00:15:52,430
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1443 |
they had some outage for,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1444 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1445 |
319
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1446 |
00:15:52,430 --> 00:15:54,125
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1447 |
I think several hours,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1448 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1449 |
320
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1450 |
00:15:54,125 --> 00:15:57,920
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1451 |
actually in their malware
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1452 |
detection subsystem.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1453 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1454 |
321
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1455 |
00:15:57,920 --> 00:16:02,060
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1456 |
And the second example
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1457 |
comes from 2016,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1458 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1459 |
322
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1460 |
00:16:02,060 --> 00:16:04,040
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1461 |
where Stack Exchange,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1462 |
I guess you know
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1463 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1464 |
323
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1465 |
00:16:04,040 --> 00:16:06,650
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1466 |
this webpage had
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1467 |
also an outage from,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1468 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1469 |
324
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1470 |
00:16:06,650 --> 00:16:08,390
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1471 |
I think at least an hour.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1472 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1473 |
325
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1474 |
00:16:08,390 --> 00:16:13,070
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1475 |
Because a regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1476 |
they needed to format posts,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1477 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1478 |
326
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1479 |
00:16:13,070 --> 00:16:15,575
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1480 |
needed too much time to
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1481 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1482 |
327
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1483 |
00:16:15,575 --> 00:16:19,010
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1484 |
recognize whether this post
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1485 |
should be accepted or not.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1486 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1487 |
328
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1488 |
00:16:19,010 --> 00:16:23,390
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1489 |
And again, there was a
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1490 |
similar kind of problem.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1491 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1492 |
329
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1493 |
00:16:23,390 --> 00:16:24,950
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1494 |
And you can read
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1495 |
the stories behind
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1496 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1497 |
330
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1498 |
00:16:24,950 --> 00:16:28,080
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1499 |
that on these two given links.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1500 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1501 |
331
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1502 |
00:16:28,720 --> 00:16:31,730
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1503 |
When I looked at
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1504 |
this the first time,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1505 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1506 |
332
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1507 |
00:16:31,730 --> 00:16:34,175
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1508 |
what surprised me is
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1509 |
that theoreticians
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1510 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1511 |
333
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1512 |
00:16:34,175 --> 00:16:37,520
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1513 |
who sometimes dedicate their
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1514 |
life to regular expressions.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1515 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1516 |
334
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1517 |
00:16:37,520 --> 00:16:39,440
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1518 |
and know really a lot about
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1519 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1520 |
335
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1521 |
00:16:39,440 --> 00:16:41,690
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1522 |
them didn't know
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1523 |
anything about this.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1524 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1525 |
336
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1526 |
00:16:41,690 --> 00:16:43,610
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1527 |
But engineers, they
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1528 |
already created
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1529 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1530 |
337
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1531 |
00:16:43,610 --> 00:16:46,160
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1532 |
a name for that
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1533 |
regular expression,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1534 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1535 |
338
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1536 |
00:16:46,160 --> 00:16:47,975
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1537 |
denial of service attack.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1538 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1539 |
339
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1540 |
00:16:47,975 --> 00:16:49,745
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1541 |
Because what you can,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1542 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1543 |
340
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1544 |
00:16:49,745 --> 00:16:51,230
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1545 |
what can happen now is that
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1546 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1547 |
341
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1548 |
00:16:51,230 --> 00:16:54,920
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1549 |
attackers look for
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1550 |
certain strings.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1551 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1552 |
342
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1553 |
00:16:54,920 --> 00:16:56,780
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1554 |
They make your regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1555 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1556 |
343
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1557 |
00:16:56,780 --> 00:16:59,105
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1558 |
matching engine topple over.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1559 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1560 |
344
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1561 |
00:16:59,105 --> 00:17:01,370
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1562 |
And these kinds of expressions,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1563 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1564 |
345
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1565 |
00:17:01,370 --> 00:17:04,160
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1566 |
regular expressions, are called
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1567 |
evil regular expressions.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1568 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1569 |
346
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1570 |
00:17:04,160 --> 00:17:06,350
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1571 |
And actually there are
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1572 |
quite a number of them.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1573 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1574 |
347
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1575 |
00:17:06,350 --> 00:17:08,495
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1576 |
So you have seen this one,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1577 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1578 |
348
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1579 |
00:17:08,495 --> 00:17:11,255
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1580 |
the first one, and the
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1581 |
second one already.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1582 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1583 |
349
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1584 |
00:17:11,255 --> 00:17:13,400
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1585 |
But there are many, many more.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1586 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1587 |
350
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1588 |
00:17:13,400 --> 00:17:15,620
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1589 |
And you can easily have in
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1590 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1591 |
351
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1592 |
00:17:15,620 --> 00:17:18,560
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1593 |
your program one of
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1594 |
these regular expressions.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1595 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1596 |
352
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1597 |
00:17:18,560 --> 00:17:21,830
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1598 |
And then you have the
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1599 |
problem that if you do have
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1600 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1601 |
353
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1602 |
00:17:21,830 --> 00:17:23,240
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1603 |
this regular expression and
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1604 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1605 |
354
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1606 |
00:17:23,240 --> 00:17:25,640
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1607 |
somebody finds the
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1608 |
corresponding string,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1609 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1610 |
355
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1611 |
00:17:25,640 --> 00:17:29,945
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1612 |
which makes the regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1613 |
matching engine topple over,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1614 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1615 |
356
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1616 |
00:17:29,945 --> 00:17:31,820
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1617 |
then you have a problem
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1618 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1619 |
357
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1620 |
00:17:31,820 --> 00:17:34,295
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1621 |
because your webpage is
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1622 |
probably not available.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1623 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1624 |
358
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1625 |
00:17:34,295 --> 00:17:36,140
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1626 |
This is also sometimes called
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1627 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1628 |
359
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1629 |
00:17:36,140 --> 00:17:39,350
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1630 |
this phenomenon,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1631 |
catastrophic backtracking.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1632 |
|
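[Editor's sketch, not part of the spoken lecture.] A minimal illustration of catastrophic backtracking, assuming Python's built-in backtracking `re` engine. The pattern `(a*)*b` and the helper `time_match` are chosen here for illustration only; they are not necessarily the expressions shown on the slides. The input consists only of the letter a, so the match must fail, and the engine typically tries a rapidly growing number of ways to split the input before giving up.

```python
import re
import time

# Illustrative "evil" pattern (an assumption for this sketch):
# nested stars followed by a character the input never contains.
EVIL = re.compile(r"(a*)*b")

def time_match(n):
    """Return the seconds needed to fail matching n copies of 'a'."""
    s = "a" * n
    start = time.perf_counter()
    result = EVIL.match(s)
    elapsed = time.perf_counter() - start
    assert result is None  # there is no 'b', so the match cannot succeed
    return elapsed

# Keep n small: the running time tends to grow very quickly with n.
for n in (10, 15, 20):
    print(n, time_match(n))
```

The timings depend on the machine and the Python version, so none are quoted here. The point is only the trend as n grows.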
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1633 |
360
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1634 |
00:17:39,350 --> 00:17:43,595
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1635 |
In lecture three, we will
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1636 |
look at this more carefully.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1637 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1638 |
361
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1639 |
00:17:43,595 --> 00:17:46,910
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1640 |
And actually why that
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1641 |
is such a problem in
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1642 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1643 |
362
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1644 |
00:17:46,910 --> 00:17:50,795
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1645 |
real life is actually
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1646 |
not to do with lexers.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1647 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1648 |
363
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1649 |
00:17:50,795 --> 00:17:53,180
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1650 |
Yes, regular
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1651 |
expressions are used as
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1652 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1653 |
364
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1654 |
00:17:53,180 --> 00:17:55,040
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1655 |
the basic tool for implementing
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1656 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1657 |
365
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1658 |
00:17:55,040 --> 00:17:57,185
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1659 |
lexers. But regular expressions are,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1660 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1661 |
366
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1662 |
00:17:57,185 --> 00:18:00,065
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1663 |
of course, used in
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1664 |
a much wider area.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1665 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1666 |
367
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1667 |
00:18:00,065 --> 00:18:03,770
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1668 |
And they are especially used for
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1669 |
network intrusion detection.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1670 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1671 |
368
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1672 |
00:18:03,770 --> 00:18:06,590
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1673 |
Imagine you have to
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1674 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1675 |
369
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1676 |
00:18:06,590 --> 00:18:10,130
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1677 |
administer a big network
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1678 |
and you only want to let
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1679 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1680 |
370
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1681 |
00:18:10,130 --> 00:18:13,640
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1682 |
in packets which you think are OK
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1683 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1684 |
371
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1685 |
00:18:13,640 --> 00:18:14,930
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1686 |
and you want to keep out
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1687 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1688 |
372
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1689 |
00:18:14,930 --> 00:18:17,645
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1690 |
any packet which might
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1691 |
hack into your network.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1692 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1693 |
373
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1694 |
00:18:17,645 --> 00:18:22,670
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1695 |
So what they have is they
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1696 |
have suites of thousands and
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1697 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1698 |
374
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1699 |
00:18:22,670 --> 00:18:25,745
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1700 |
sometimes even more
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1701 |
regular expressions which
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1702 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1703 |
375
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1704 |
00:18:25,745 --> 00:18:27,755
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1705 |
check whether this packet
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1706 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1707 |
376
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1708 |
00:18:27,755 --> 00:18:30,065
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1709 |
satisfies some patterns or not.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1710 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1711 |
377
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1712 |
00:18:30,065 --> 00:18:31,460
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1713 |
And in this case it will be left
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1714 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1715 |
378
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1716 |
00:18:31,460 --> 00:18:34,205
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1717 |
out or it will be let in.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1718 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1719 |
379
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1720 |
00:18:34,205 --> 00:18:36,335
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1721 |
And with networks,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1722 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1723 |
380
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1724 |
00:18:36,335 --> 00:18:39,080
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1725 |
the problem is that our
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1726 |
hardware is already
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1727 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1728 |
381
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1729 |
00:18:39,080 --> 00:18:43,190
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1730 |
so fast that the regular expressions
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1731 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1732 |
382
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1733 |
00:18:43,190 --> 00:18:45,169
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1734 |
really become a bottleneck.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1735 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1736 |
383
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1737 |
00:18:45,169 --> 00:18:47,060
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1738 |
Because what do you do if now
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1739 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1740 |
384
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1741 |
00:18:47,060 --> 00:18:49,880
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1742 |
suddenly a regular expression
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1743 |
takes too much time?
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1744 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1745 |
385
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1746 |
00:18:49,880 --> 00:18:52,670
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1747 |
Do you just stop the matching
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1748 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1749 |
386
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1750 |
00:18:52,670 --> 00:18:55,100
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1751 |
and let the packet
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1752 |
in regardless?
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1753 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1754 |
387
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1755 |
00:18:55,100 --> 00:18:58,190
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1756 |
Or do you just hold
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1757 |
the network up
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1758 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1759 |
388
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1760 |
00:18:58,190 --> 00:19:01,715
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1761 |
and don't let anything in
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1762 |
until you have decided?
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1763 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1764 |
389
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1765 |
00:19:01,715 --> 00:19:04,895
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1766 |
So that's actually a
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1767 |
really hard problem.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1768 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1769 |
390
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1770 |
00:19:04,895 --> 00:19:06,650
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1771 |
But the first time I came across
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1772 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1773 |
391
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1774 |
00:19:06,650 --> 00:19:09,965
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1775 |
that problem was actually
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1776 |
through this engineer.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1777 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1778 |
392
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1779 |
00:19:09,965 --> 00:19:13,820
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1780 |
And they always say that
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1781 |
Germans don't have any humor.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1782 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1783 |
393
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1784 |
00:19:13,820 --> 00:19:16,985
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1785 |
But I found that
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1786 |
video quite funny.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1787 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1788 |
394
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1789 |
00:19:16,985 --> 00:19:19,145
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1790 |
Maybe you have a
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1791 |
different opinion,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1792 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1793 |
395
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1794 |
00:19:19,145 --> 00:19:21,095
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1795 |
but feel free to
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1796 |
have a look, as it
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1797 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1798 |
396
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1799 |
00:19:21,095 --> 00:19:23,705
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1800 |
explains exactly that problem.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1801 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1802 |
397
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1803 |
00:19:23,705 --> 00:19:25,610
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1804 |
So in the next video,
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1805 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1806 |
398
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1807 |
00:19:25,610 --> 00:19:28,445
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1808 |
we will start to
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1809 |
implement this matcher.
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1810 |
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1811 |
399
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1812 |
00:19:28,445 --> 00:19:30,870
|
Christian Urban <christian.urban@kcl.ac.uk>
parents:
diff
changeset
|
1813 |
So I hope to see you there.
|