| 763 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |      1 | 1
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |      2 | 00:00:06,710 --> 00:00:09,225
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |      3 | Thanks for tuning in again.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |      4 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |      5 | 2
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |      6 | 00:00:09,225 --> 00:00:11,640
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |      7 | In this video, we want to specify
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |      8 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |      9 | 3
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     10 | 00:00:11,640 --> 00:00:14,370
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     11 | what problem our regular
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     12 | expression matcher
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     13 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     14 | 4
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     15 | 00:00:14,370 --> 00:00:16,155
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     16 | is actually supposed to solve.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     17 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     18 | 5
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     19 | 00:00:16,155 --> 00:00:18,900
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     20 | The reason is that
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     21 | we know that some of
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     22 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     23 | 6
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     24 | 00:00:18,900 --> 00:00:21,585
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     25 | the existing regular
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     26 | expression matching engines
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     27 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     28 | 7
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     29 | 00:00:21,585 --> 00:00:25,200
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     30 | are not just abysmally
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     31 | slow in some examples,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     32 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     33 | 8
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     34 | 00:00:25,200 --> 00:00:27,105
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     35 | as you've seen in the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     36 | previous video,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     37 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     38 | 9
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     39 | 00:00:27,105 --> 00:00:30,570
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     40 | but also produce sometimes
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     41 | incorrect results.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     42 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     43 | 10
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     44 | 00:00:30,570 --> 00:00:33,330
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     45 | In order to avoid
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     46 | this with our matcher,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     47 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     48 | 11
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     49 | 00:00:33,330 --> 00:00:35,325
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     50 | we need to somehow explain
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     51 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     52 | 12
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     53 | 00:00:35,325 --> 00:00:39,255
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     54 | precisely what is the problem
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     55 | our algorithm solves.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     56 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     57 | 13
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     58 | 00:00:39,255 --> 00:00:41,935
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     59 | This will require
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     60 | a bit of theory, but
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     61 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     62 | 14
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     63 | 00:00:41,935 --> 00:00:45,335
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     64 | I hope it is nevertheless
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     65 | a bit of fun.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     66 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     67 | 15
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     68 | 00:00:45,335 --> 00:00:47,915
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     69 | First, we have to specify
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     70 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     71 | 16
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     72 | 00:00:47,915 --> 00:00:50,585
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     73 | what we mean by a
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     74 | regular expression.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     75 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     76 | 17
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     77 | 00:00:50,585 --> 00:00:53,210
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     78 | You've seen earlier some
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     79 | examples. They were
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     80 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     81 | 18
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     82 | 00:00:53,210 --> 00:00:56,060
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     83 | actually taken or
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     84 | inspired by what
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     85 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     86 | 19
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     87 | 00:00:56,060 --> 00:00:58,850
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     88 | is available in standard
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     89 | regular expression matching
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     90 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     91 | 20
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     92 | 00:00:58,850 --> 00:01:02,330
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     93 | engines, like star, plus and n-times.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     94 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     95 | 21
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     96 | 00:01:02,330 --> 00:01:05,690
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     97 | But for many tasks,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     98 | for our algorithm,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |     99 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    100 | 22
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    101 | 00:01:05,690 --> 00:01:10,174
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    102 | we will focus only what I call
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    103 | basic regular expressions.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    104 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    105 | 23
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    106 | 00:01:10,174 --> 00:01:11,840
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    107 | Since I'm lazy, I will call
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    108 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    109 | 24
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    110 | 00:01:11,840 --> 00:01:13,550
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    111 | these basic regular expressions
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    112 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    113 | 25
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    114 | 00:01:13,550 --> 00:01:15,485
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    115 | just as regular expressions.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    116 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    117 | 26
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    118 | 00:01:15,485 --> 00:01:17,405
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    119 | And the ones you've seen earlier
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    120 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    121 | 27
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    122 | 00:01:17,405 --> 00:01:19,400
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    123 | as extended regular expressions.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    124 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    125 | 28
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    126 | 00:01:19,400 --> 00:01:22,940
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    127 | So the basic regulare expressions,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    128 | or just regular expressions,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    129 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    130 | 29
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    131 | 00:01:22,940 --> 00:01:25,280
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    132 | they will have characters.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    133 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    134 | 30
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    135 | 00:01:25,280 --> 00:01:27,170
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    136 | So you can match any character,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    137 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    138 | 31
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    139 | 00:01:27,170 --> 00:01:31,370
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    140 | a,b,c to z or 0 to 9.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    141 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    142 | 32
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    143 | 00:01:31,370 --> 00:01:35,525
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    144 | Any Ascii character. 'c' here
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    145 | is just a representative.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    146 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    147 | 33
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    148 | 00:01:35,525 --> 00:01:38,825
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    149 | So we can match
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    150 | single characters.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    151 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    152 | 34
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    153 | 00:01:38,825 --> 00:01:42,440
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    154 | Then we can match alternatives.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    155 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    156 | 35
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    157 | 00:01:42,440 --> 00:01:44,930
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    158 | That means a string
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    159 | is either matched
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    160 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    161 | 36
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    162 | 00:01:44,930 --> 00:01:46,730
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    163 | by the regular expression r1
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    164 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    165 | 37
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    166 | 00:01:46,730 --> 00:01:49,324
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    167 | or by the regular expression r2.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    168 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    169 | 38
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    170 | 00:01:49,324 --> 00:01:52,790
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    171 | And for the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    172 | alternative we write +.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    173 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    174 | 39
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    175 | 00:01:52,790 --> 00:01:55,175
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    176 | Then we also have sequence.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    177 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    178 | 40
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    179 | 00:01:55,175 --> 00:01:57,410
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    180 | This sequence regular
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    181 | expression essentially
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    182 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    183 | 41
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    184 | 00:01:57,410 --> 00:01:59,915
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    185 | says that a string needs to be matched
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    186 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    187 | 42
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    188 | 00:01:59,915 --> 00:02:02,210
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    189 | the first part by
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    190 | the regular expression r1
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    191 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    192 | 43
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    193 | 00:02:02,210 --> 00:02:06,275
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    194 | and then the second
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    195 | part by the r2.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    196 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    197 | 44
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    198 | 00:02:06,275 --> 00:02:10,190
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    199 | And then we have also the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    200 | star regular expression,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    201 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    202 | 45
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    203 | 00:02:10,190 --> 00:02:12,980
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    204 | which says the regular
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    205 | expression needs to match
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    206 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    207 | 46
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    208 | 00:02:12,980 --> 00:02:16,520
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    209 | the string with zero
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    210 | or more copies.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    211 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    212 | 47
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    213 | 00:02:16,520 --> 00:02:18,140
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    214 | And then we also have some
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    215 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    216 | 48
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    217 | 00:02:18,140 --> 00:02:20,060
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    218 | slightly strange
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    219 | regular expressions.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    220 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    221 | 49
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    222 | 00:02:20,060 --> 00:02:22,505
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    223 | We have the regular expression 1,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    224 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    225 | 50
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    226 | 00:02:22,505 --> 00:02:25,910
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    227 | which can only match
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    228 | the empty string.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    229 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    230 | 51
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    231 | 00:02:25,910 --> 00:02:29,075
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    232 | I'm using here the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    233 | notation 1 for that
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    234 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    235 | 52
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    236 | 00:02:29,075 --> 00:02:31,340
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    237 | and in my writing I will always
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    238 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    239 | 53
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    240 | 00:02:31,340 --> 00:02:33,440
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    241 | make sure that for the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    242 | regular expression
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    243 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    244 | 54
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    245 | 00:02:33,440 --> 00:02:35,765
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    246 | I will write the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    247 | 1 in a bold font.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    248 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    249 | 55
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    250 | 00:02:35,765 --> 00:02:38,510
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    251 | So whenever you see
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    252 | a 1 in bold font,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    253 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    254 | 56
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    255 | 00:02:38,510 --> 00:02:40,395
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    256 | this is not the 1, but
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    257 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    258 | 57
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    259 | 00:02:40,395 --> 00:02:44,300
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    260 | the regular expression which
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    261 | can match the empty string.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    262 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    263 | 58
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    264 | 00:02:44,300 --> 00:02:48,050
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    265 | And we also have the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    266 | regular expression 0,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    267 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    268 | 59
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    269 | 00:02:48,050 --> 00:02:50,315
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    270 | which cannot match
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    271 | anything at all.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    272 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    273 | 60
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    274 | 00:02:50,315 --> 00:02:51,695
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    275 | You might think, well,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    276 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    277 | 61
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    278 | 00:02:51,695 --> 00:02:54,635
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    279 | that's not much use if it cannot
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    280 | match anything at all,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    281 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    282 | 62
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    283 | 00:02:54,635 --> 00:02:58,130
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    284 | but you will see why that
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    285 | one is important later on.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    286 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    287 | 63
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    288 | 00:02:58,130 --> 00:03:00,785
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    289 | So our basic regular expressions,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    290 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    291 | 64
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    292 | 00:03:00,785 --> 00:03:02,375
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    293 | they will be 0,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    294 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    295 | 65
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    296 | 00:03:02,375 --> 00:03:08,390
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    297 | 1, characters, alternatives,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    298 | sequences and stars.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    299 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    300 | 66
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    301 | 00:03:08,390 --> 00:03:12,170
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    302 | And these are all the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    303 | basic regular expressions.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    304 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    305 | 67
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    306 | 00:03:12,170 --> 00:03:16,280
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    307 | If this definition is a
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    308 | bit too abstract for you,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    309 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    310 | 68
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    311 | 00:03:16,280 --> 00:03:18,560
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    312 | we can also look at
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    313 | the concrete code,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    314 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    315 | 69
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    316 | 00:03:18,560 --> 00:03:23,060
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    317 | how that would pan out when
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    318 | actually writing some Scala.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    319 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    320 | 70
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    321 | 00:03:23,060 --> 00:03:28,040
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    322 | I promised you, I show
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    323 | you always my code in Scala.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    324 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    325 | 71
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    326 | 00:03:28,040 --> 00:03:29,480
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    327 | So here you would have
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    328 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    329 | 72
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    330 | 00:03:29,480 --> 00:03:32,885
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    331 | first an abstract class
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    332 | for regular expressions.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    333 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    334 | 73
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    335 | 00:03:32,885 --> 00:03:37,580
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    336 | Then you have one regular
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    337 | expression for 0, 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    338 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    339 | 74
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    340 | 00:03:37,580 --> 00:03:41,540
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    341 | one regular expression for 1, 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    342 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    343 | 75
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    344 | 00:03:41,540 --> 00:03:42,875
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    345 | one regular expression, which
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    346 | takes an argument,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    347 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    348 | 76
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    349 | 00:03:42,875 --> 00:03:45,050
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    350 | the character you want to match,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    351 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    352 | 77
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    353 | 00:03:45,050 --> 00:03:47,915
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    354 | the characters a,b, c and so on.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    355 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    356 | 78
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    357 | 00:03:47,915 --> 00:03:50,945
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    358 | Then we have an alternative
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    359 | regular expression,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    360 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    361 | 79
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    362 | 00:03:50,945 --> 00:03:53,480
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    363 | which takes the first
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    364 | alternative and
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    365 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    366 | 80
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    367 | 00:03:53,480 --> 00:03:56,435
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    368 | the second alternative
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    369 | as arguments.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    370 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    371 | 81
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    372 | 00:03:56,435 --> 00:03:59,690
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    373 | And we have a sequence
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    374 | regular expression. Again,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    375 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    376 | 82
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    377 | 00:03:59,690 --> 00:04:01,850
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    378 | which takes the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    379 | first component and
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    380 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    381 | 83
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    382 | 00:04:01,850 --> 00:04:04,730
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    383 | the second component
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    384 | as two arguments.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    385 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    386 | 84
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    387 | 00:04:04,730 --> 00:04:07,249
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    388 | And we have the star
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    389 | regular expression,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    390 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    391 | 85
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    392 | 00:04:07,249 --> 00:04:10,880
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    393 | which just take one regular
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    394 | expression as argument.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    395 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    396 | 86
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    397 | 00:04:10,880 --> 00:04:16,115
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    398 | And all these reg expressions
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    399 | extend our abstract class.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    400 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    401 | 87
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    402 | 00:04:16,115 --> 00:04:20,300
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    403 | For whatever I do in
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    404 | this module here I have
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    405 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    406 | 88
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    407 | 00:04:20,300 --> 00:04:23,300
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    408 | the convention that all
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    409 | the regular expressions
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    410 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    411 | 89
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    412 | 00:04:23,300 --> 00:04:25,550
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    413 | are written with capital letters.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    414 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    415 | 90
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    416 | 00:04:25,550 --> 00:04:26,885
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    417 | As you can see that here,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    418 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    419 | 91
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    420 | 00:04:26,885 --> 00:04:31,685
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    421 | O, 1,  character, these will be
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    422 | always regular expressions.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    423 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    424 | 92
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    425 | 00:04:31,685 --> 00:04:34,370
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    426 | They have all capital letters.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    427 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    428 | 93
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    429 | 00:04:34,370 --> 00:04:36,484
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    430 | Let's for a moment,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    431 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    432 | 94
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    433 | 00:04:36,484 --> 00:04:38,720
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    434 | play around with this definition.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    435 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    436 | 95
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    437 | 00:04:38,720 --> 00:04:41,945
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    438 | I'm using here the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    439 | Ammonite REPL.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    440 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    441 | 96
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    442 | 00:04:41,945 --> 00:04:46,950
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    443 | And I can evaluate
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    444 | this definition.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    445 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    446 | 97
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    447 | 00:04:53,430 --> 00:04:55,810
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    448 | And now I can start to
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    449 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    450 | 98
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    451 | 00:04:55,810 --> 00:04:58,570
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    452 | define particular
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    453 | regular expressions.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    454 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    455 | 99
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    456 | 00:04:58,570 --> 00:05:00,340
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    457 | For example, if I need
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    458 | a regular expression
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    459 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    460 | 100
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    461 | 00:05:00,340 --> 00:05:02,860
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    462 | which can recognise
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    463 | the character a,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    464 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    465 | 101
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    466 | 00:05:02,860 --> 00:05:06,025
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    467 | then I would write
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    468 | something like this.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    469 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    470 | 102
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    471 | 00:05:06,025 --> 00:05:08,710
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    472 | So this regular expression
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    473 | takes an argument,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    474 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    475 | 103
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    476 | 00:05:08,710 --> 00:05:13,615
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    477 | the character 'a'  to specify
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    478 | which character to match.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    479 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    480 | 104
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    481 | 00:05:13,615 --> 00:05:16,945
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    482 | We do this obviously also with 'b'.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    483 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    484 | 105
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    485 | 00:05:16,945 --> 00:05:19,405
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    486 | And I can do that with
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    487 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    488 | 106
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    489 | 00:05:19,405 --> 00:05:22,975
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    490 | 'c'. So now we have three
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    491 | regular expressions.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    492 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    493 | 107
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    494 | 00:05:22,975 --> 00:05:25,570
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    495 | If you look very carefully
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    496 | at this definition,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    497 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    498 | 108
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    499 | 00:05:25,570 --> 00:05:27,070
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    500 | you can actually see
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    501 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    502 | 109
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    503 | 00:05:27,070 --> 00:05:29,940
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    504 | these regular
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    505 | expressions are trees.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    506 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    507 | 110
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    508 | 00:05:29,940 --> 00:05:33,365
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    509 | So no matter what we
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    510 | write down on paper,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    511 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    512 | 111
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    513 | 00:05:33,365 --> 00:05:36,755
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    514 | they are behind the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    515 | scenes always trees.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    516 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    517 | 112
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    518 | 00:05:36,755 --> 00:05:40,010
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    519 | And you can see that
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    520 | actually in this definition.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    521 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    522 | 113
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    523 | 00:05:40,010 --> 00:05:44,330
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    524 | If you define two regular
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    525 | expressions r1 and r2.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    526 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    527 | 114
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    528 | 00:05:44,330 --> 00:05:49,310
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    529 | They are essentially
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    530 | the alternative of a, b and c.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    531 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    532 | 115
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    533 | 00:05:49,310 --> 00:05:52,760
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    534 | Then this regular expression
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    535 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    536 | 116
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    537 | 00:05:52,760 --> 00:05:54,710
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    538 | can match either the character
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    539 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    540 | 117
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    541 | 00:05:54,710 --> 00:05:57,980
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    542 | a or the character b
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    543 | or the character c.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    544 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    545 | 118
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    546 | 00:05:57,980 --> 00:06:01,640
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    547 | And the same for the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    548 | regular expression r2.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    549 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    550 | 119
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    551 | 00:06:01,640 --> 00:06:03,875
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    552 | So let me just evaluate that.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    553 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    554 | 120
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    555 | 00:06:03,875 --> 00:06:05,690
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    556 | And even though these are
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    557 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    558 | 121
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    559 | 00:06:05,690 --> 00:06:07,175
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    560 | two regular expressions
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    561 | which can match
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    562 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    563 | 122
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    564 | 00:06:07,175 --> 00:06:11,750
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    565 | exactly the same things,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    566 | they a different trees.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    567 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    568 | 123
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    569 | 00:06:11,750 --> 00:06:14,195
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    570 | So if I ask Scala,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    571 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    572 | 124
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    573 | 00:06:14,195 --> 00:06:16,460
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    574 | are these trees different?
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    575 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    576 | 125
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    577 | 00:06:16,460 --> 00:06:19,250
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    578 | Or ask if they're
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    579 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    580 | 126
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    581 | 00:06:19,250 --> 00:06:21,865
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    582 | the same, then Scala will say No,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    583 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    584 | 127
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    585 | 00:06:21,865 --> 00:06:25,440
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    586 | they actually different trees.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    587 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    588 | 128
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    589 | 00:06:25,450 --> 00:06:28,459
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    590 | Let's come back to
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    591 | this definition.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    592 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    593 | 129
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    594 | 00:06:28,459 --> 00:06:31,760
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    595 | If we want to write down
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    596 | regular expressions on paper,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    597 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    598 | 130
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    599 | 00:06:31,760 --> 00:06:33,620
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    600 | then we want to be sloppy as
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    601 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    602 | 131
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    603 | 00:06:33,620 --> 00:06:35,750
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    604 | mathematicians rather than as
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    605 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    606 | 132
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    607 | 00:06:35,750 --> 00:06:37,745
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    608 | precise as computer scientists.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    609 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    610 | 133
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    611 | 00:06:37,745 --> 00:06:40,490
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    612 | So when we want to write down
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    613 | a regular expression which can
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    614 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    615 | 134
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    616 | 00:06:40,490 --> 00:06:43,955
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    617 | either match the character
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    618 | a or the character b,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    619 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    620 | 135
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    621 | 00:06:43,955 --> 00:06:49,130
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    622 | then we would write down
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    623 | something like this, a plus b.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    624 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    625 | 136
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    626 | 00:06:49,130 --> 00:06:51,170
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    627 | And if you want to have
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    628 | the regular expression
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    629 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    630 | 137
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    631 | 00:06:51,170 --> 00:06:52,625
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    632 | which can either match
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    633 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    634 | 138
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    635 | 00:06:52,625 --> 00:06:55,925
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    636 | the character a or b or c,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    637 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    638 | 139
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    639 | 00:06:55,925 --> 00:06:58,340
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    640 | we will write
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    641 | something like this.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    642 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    643 | 140
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    644 | 00:06:58,340 --> 00:07:01,370
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    645 | But of course behind the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    646 | scenes, these are trees.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    647 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    648 | 141
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    649 | 00:07:01,370 --> 00:07:04,460
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    650 | So we should have written
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    651 | them with parentheses.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    652 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    653 | 142
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    654 | 00:07:04,460 --> 00:07:06,440
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    655 | And you can see
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    656 | actually, there are two
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    657 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    658 | 143
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    659 | 00:07:06,440 --> 00:07:08,990
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    660 | regular expressions I
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    661 | could have written down.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    662 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    663 | 144
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    664 | 00:07:08,990 --> 00:07:11,270
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    665 | They're different.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    666 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    667 | 145
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    668 | 00:07:11,270 --> 00:07:12,710
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    669 | Just by convention,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    670 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    671 | 146
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    672 | 00:07:12,710 --> 00:07:15,575
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    673 | we on't write these parentheses.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    674 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    675 | 147
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    676 | 00:07:15,575 --> 00:07:18,740
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    677 | And that is similar with sequences.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    678 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    679 | 148
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    680 | 00:07:18,740 --> 00:07:20,000
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    681 | If I want to write down
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    682 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    683 | 149
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    684 | 00:07:20,000 --> 00:07:22,955
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    685 | the regular expression which
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    686 | can match first an 'a',
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    687 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    688 | 150
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    689 | 00:07:22,955 --> 00:07:25,010
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    690 | then a 'b', and then a 'c',
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    691 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    692 | 151
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    693 | 00:07:25,010 --> 00:07:28,160
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    694 | then I would write down
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    695 | something like this.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    696 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    697 | 152
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    698 | 00:07:28,160 --> 00:07:32,120
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    699 | Just, there are again
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    700 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    701 | 153
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    702 | 00:07:32,120 --> 00:07:35,735
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    703 | two regular expressions I
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    704 | could have written down.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    705 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    706 | 154
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    707 | 00:07:35,735 --> 00:07:38,480
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    708 | Again by convention we don't
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    709 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    710 | 155
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    711 | 00:07:38,480 --> 00:07:40,670
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    712 | write these parentheses though.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    713 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    714 | 156
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    715 | 00:07:40,670 --> 00:07:42,350
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    716 | However, sometimes we have to be
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    717 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    718 | 157
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    719 | 00:07:42,350 --> 00:07:43,940
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    720 | very careful with parentheses,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    721 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    722 | 158
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    723 | 00:07:43,940 --> 00:07:47,195
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    724 | especially with star. 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    725 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    726 | 159
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    727 | 00:07:47,195 --> 00:07:50,525
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    728 | Because this regular expression
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    729 | is definitely not
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    730 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    731 | 160
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    732 | 00:07:50,525 --> 00:07:54,900
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    733 | the same as this regular expression.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    734 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    735 | 161
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    736 | 00:07:56,100 --> 00:07:59,410
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    737 | The first one here can match
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    738 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    739 | 162
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    740 | 00:07:59,410 --> 00:08:03,610
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    741 | any strings containing a or b's.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    742 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    743 | 163
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    744 | 00:08:03,610 --> 00:08:05,860
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    745 | While this regular expression can
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    746 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    747 | 164
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    748 | 00:08:05,860 --> 00:08:07,945
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    749 | only match the single character
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    750 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    751 | 165
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    752 | 00:08:07,945 --> 00:08:13,300
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    753 | a or any string
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    754 | containing only b's.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    755 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    756 | 166
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    757 | 00:08:13,300 --> 00:08:15,265
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    758 | So to make the difference clear,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    759 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    760 | 167
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    761 | 00:08:15,265 --> 00:08:20,065
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    762 | in this example, we would have
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    763 | to use the parentheses.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    764 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    765 | 168
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    766 | 00:08:20,065 --> 00:08:23,140
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    767 | There's one more issue
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    768 | with this definition.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    769 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    770 | 169
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    771 | 00:08:23,140 --> 00:08:26,635
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    772 | Why do we focus on these
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    773 | basic regular expressions?
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    774 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    775 | 170
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    776 | 00:08:26,635 --> 00:08:28,660
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    777 | Why don't we also include
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    778 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    779 | 171
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    780 | 00:08:28,660 --> 00:08:31,285
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    781 | the ones from the
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    782 | extended regular expressions.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    783 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    784 | 172
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    785 | 00:08:31,285 --> 00:08:33,055
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    786 | The answers very easy.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    787 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    788 | 173
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    789 | 00:08:33,055 --> 00:08:35,680
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    790 | These basic regular
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    791 | expressions can be used
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    792 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    793 | 174
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    794 | 00:08:35,680 --> 00:08:38,370
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    795 | to represent also
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    796 | the extended ones.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    797 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    798 | 175
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    799 | 00:08:38,370 --> 00:08:40,220
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    800 | Let me give you some examples.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    801 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    802 | 176
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    803 | 00:08:40,220 --> 00:08:44,225
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    804 | If I have a regular
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    805 | expression r+, for example,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    806 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    807 | 177
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    808 | 00:08:44,225 --> 00:08:46,280
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    809 | then the meaning
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    810 | was I have to use
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    811 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    812 | 178
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    813 | 00:08:46,280 --> 00:08:49,115
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    814 | at least one or more copies
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    815 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    816 | 179
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    817 | 00:08:49,115 --> 00:08:51,200
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    818 | of this r to
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    819 | match a string. 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    820 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    821 | 180
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    822 | 00:08:51,200 --> 00:08:53,810
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    823 | Well, one or more copies
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    824 | can be represented by
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    825 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    826 | 181
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    827 | 00:08:53,810 --> 00:08:58,385
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    828 | the basic ones as just
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    829 | r followed by r*.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    830 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    831 | 182
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    832 | 00:08:58,385 --> 00:09:01,760
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    833 | Meaning I have to use one
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    834 | copy of r, followed by
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    835 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    836 | 183
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    837 | 00:09:01,760 --> 00:09:05,150
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    838 | 0 or more copies of r.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    839 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    840 | 184
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    841 | 00:09:05,150 --> 00:09:07,895
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    842 | Similarly, if I have the optional
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    843 | regular expression,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    844 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    845 | 185
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    846 | 00:09:07,895 --> 00:09:10,715
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    847 | which is supposed to
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    848 | match a string
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    849 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    850 | 186
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    851 | 00:09:10,715 --> 00:09:13,865
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    852 | by using r, or match
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    853 | the empty string.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    854 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    855 | 187
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    856 | 00:09:13,865 --> 00:09:19,295
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    857 | Then this can be obviously
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    858 | defined as r + 1.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    859 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    860 | 188
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    861 | 00:09:19,295 --> 00:09:23,945
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    862 | So here is the bold
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    863 | regular expression 1,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    864 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    865 | 189
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    866 | 00:09:23,945 --> 00:09:26,180
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    867 | which means it either can
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    868 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    869 | 190
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    870 | 00:09:26,180 --> 00:09:28,205
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    871 | recognize whatever
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    872 | r can recognize,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    873 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    874 | 191
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    875 | 00:09:28,205 --> 00:09:30,470
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    876 | or it can recognize
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    877 | the empty string.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    878 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    879 | 192
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    880 | 00:09:30,470 --> 00:09:35,150
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    881 | And if I have ranges, like a
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    882 | to z,  then I can define
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    883 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    884 | 193
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    885 | 00:09:35,150 --> 00:09:41,135
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    886 | that as a + b + c + ...
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    887 | and so on until z.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    888 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    889 | 194
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    890 | 00:09:41,135 --> 00:09:45,920
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    891 | Maybe this definition is not
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    892 | good in terms of runtime,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    893 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    894 | 195
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    895 | 00:09:45,920 --> 00:09:47,960
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    896 | but in terms of just being able
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    897 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    898 | 196
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    899 | 00:09:47,960 --> 00:09:50,780
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    900 | to recognize strings
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    901 | or match strings,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    902 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    903 | 197
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    904 | 00:09:50,780 --> 00:09:54,680
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    905 | the basic regular expressions
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    906 | will be just sufficient.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    907 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    908 | 198
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    909 | 00:09:54,680 --> 00:09:56,690
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    910 | Unfortunately, we
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    911 | also need to have
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    912 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    913 | 199
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    914 | 00:09:56,690 --> 00:09:58,850
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    915 | a quick chat about strings.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    916 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    917 | 200
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    918 | 00:09:58,850 --> 00:10:02,255
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    919 | In Scala, it's crystal
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    920 | clear what a string is.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    921 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    922 | 201
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    923 | 00:10:02,255 --> 00:10:05,480
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    924 | There's a separate datatype
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    925 | which is called string.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    926 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    927 | 202
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    928 | 00:10:05,480 --> 00:10:07,895
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    929 | So here, for example,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    930 | is a string.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    931 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    932 | 203
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    933 | 00:10:07,895 --> 00:10:09,200
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    934 | And as you can see,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    935 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    936 | 204
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    937 | 00:10:09,200 --> 00:10:11,105
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    938 | it is of the type string.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    939 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    940 | 205
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    941 | 00:10:11,105 --> 00:10:13,985
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    942 | And the empty string
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    943 | will be just that.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    944 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    945 | 206
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    946 | 00:10:13,985 --> 00:10:16,160
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    947 | However, when we write things down on
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    948 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    949 | 207
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    950 | 00:10:16,160 --> 00:10:18,320
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    951 | paper and think
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    952 | about our algorithm,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    953 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    954 | 208
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    955 | 00:10:18,320 --> 00:10:22,790
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    956 | we want to think of strings
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    957 | as lists of characters.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    958 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    959 | 209
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    960 | 00:10:22,790 --> 00:10:26,070
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    961 | So more something like this.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    962 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    963 | 210
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    964 | 00:10:27,070 --> 00:10:31,745
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    965 | You can see here, this is actually
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    966 | a list of characters.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    967 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    968 | 211
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    969 | 00:10:31,745 --> 00:10:35,150
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    970 | And the two operations
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    971 | we need are taking
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    972 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    973 | 212
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    974 | 00:10:35,150 --> 00:10:37,280
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    975 | the head of this list and
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    976 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    977 | 213
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    978 | 00:10:37,280 --> 00:10:39,770
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    979 | the rest of the list
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    980 | or tail of the list.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    981 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    982 | 214
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    983 | 00:10:39,770 --> 00:10:41,720
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    984 | That's why we want
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    985 | to regard them as
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    986 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    987 | 215
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    988 | 00:10:41,720 --> 00:10:45,260
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    989 | lists rather than strings.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    990 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    991 | 216
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    992 | 00:10:45,260 --> 00:10:48,200
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    993 | So if I'm using a
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    994 | string like this,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    995 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    996 | 217
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    997 | 00:10:48,200 --> 00:10:51,935
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    998 | then on paper I always will
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |    999 | write something like that.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1000 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1001 | 218
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1002 | 00:10:51,935 --> 00:10:54,575
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1003 | Or since I'm lazy, just that.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1004 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1005 | 219
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1006 | 00:10:54,575 --> 00:10:56,675
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1007 | And for the empty string,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1008 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1009 | 220
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1010 | 00:10:56,675 --> 00:10:59,210
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1011 | I will write either
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1012 | the empty list, with
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1013 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1014 | 221
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1015 | 00:10:59,210 --> 00:11:03,920
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1016 | two brackets or,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1017 | being lazy, just that.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1018 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1019 | 222
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1020 | 00:11:03,920 --> 00:11:06,620
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1021 | Actually there is one
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1022 | more operation we need on
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1023 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1024 | 223
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1025 | 00:11:06,620 --> 00:11:09,410
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1026 | strings and that
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1027 | is concatenation.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1028 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1029 | 224
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1030 | 00:11:09,410 --> 00:11:11,255
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1031 | If you have a string s1,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1032 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1033 | 225
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1034 | 00:11:11,255 --> 00:11:14,510
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1035 | string s2, and put an
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1036 | at symbol in between,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1037 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1038 | 226
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1039 | 00:11:14,510 --> 00:11:18,050
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1040 | that means we want to
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1041 | concatenate both strings.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1042 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1043 | 227
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1044 | 00:11:18,050 --> 00:11:22,625
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1045 | So foo concatenated with
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1046 | bar, would be foobar.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1047 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1048 | 228
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1049 | 00:11:22,625 --> 00:11:25,085
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1050 | And any string concatenated with
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1051 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1052 | 229
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1053 | 00:11:25,085 --> 00:11:27,950
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1054 | the empty string
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1055 | is left untouched.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1056 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1057 | 230
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1058 | 00:11:27,950 --> 00:11:31,310
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1059 | So baz concatenated with
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1060 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1061 | 231
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1062 | 00:11:31,310 --> 00:11:33,545
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1063 | the empty string, is just baz.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1064 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1065 | 232
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1066 | 00:11:33,545 --> 00:11:37,295
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1067 | So that's like if we have
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1068 | strings as lists of characters,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1069 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1070 | 233
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1071 | 00:11:37,295 --> 00:11:39,755
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1072 | that will be just list append.
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1073 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1074 | 234
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1075 | 00:11:39,755 --> 00:11:41,480
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1076 | In the next video,
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1077 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1078 | 235
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1079 | 00:11:41,480 --> 00:11:43,160
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1080 | we will use these definitions
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1081 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1082 | 236
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1083 | 00:11:43,160 --> 00:11:45,050
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1084 | and introduce the notion of what
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1085 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1086 | 237
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1087 | 00:11:45,050 --> 00:11:46,850
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1088 | a language is and
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1089 | 
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1090 | 238
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1091 | 00:11:46,850 --> 00:11:49,920
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1092 | what the meaning of a
 | 
| 
Christian Urban <christian.urban@kcl.ac.uk> parents: diff
changeset |   1093 | regular expression is.
 |