videos/02-cw1.srt
changeset 774 42733adf2a48
equal deleted inserted replaced
773:260e330f7466 774:42733adf2a48
       
     1 1
       
     2 00:00:06,410 --> 00:00:09,390
       
     3 The final video for this week.
       
     4 
       
     5 2
       
     6 00:00:09,390 --> 00:00:12,465
       
     7 And let me say something
       
     8 about the coursework.
       
     9 
       
    10 3
       
    11 00:00:12,465 --> 00:00:15,300
       
    12 First. You can solve
       
    13 
       
    14 4
       
    15 00:00:15,300 --> 00:00:17,745
       
    16 the coursework in any programming
       
    17 language you're like,
       
    18 
       
    19 5
       
    20 00:00:17,745 --> 00:00:22,440
       
    21 I already said that. You
       
    22 have to submit a PDF file.
       
    23 
       
    24 6
       
    25 00:00:22,440 --> 00:00:23,850
       
    26 There will be some questions,
       
    27 
       
    28 7
       
    29 00:00:23,850 --> 00:00:26,250
       
    30 so you have to write down the
       
    31 solution to this questions.
       
    32 
       
    33 8
       
    34 00:00:26,250 --> 00:00:30,315
       
    35 Please use a PDF and you have
       
    36 to submit your source code.
       
    37 
       
    38 9
       
    39 00:00:30,315 --> 00:00:35,580
       
    40 And yes, if you do use a
       
    41 language which isn't Scala,
       
    42 
       
    43 10
       
    44 00:00:35,580 --> 00:00:39,450
       
    45 it might help to tell me
       
    46 how I can run your code.
       
    47 
       
    48 11
       
    49 00:00:39,450 --> 00:00:42,550
       
    50 If I can't run your code,
       
    51 I will ask you anyway.
       
    52 
       
    53 12
       
    54 00:00:42,550 --> 00:00:50,044
       
    55 Also, please submit both the
       
    56 PDF and the code in a file,
       
    57 
       
    58 13
       
    59 00:00:50,044 --> 00:00:52,730
       
    60 in a zip file, which generates
       
    61 
       
    62 14
       
    63 00:00:52,730 --> 00:00:55,835
       
    64 a subdirectory with your
       
    65 name and your family name.
       
    66 
       
    67 15
       
    68 00:00:55,835 --> 00:00:58,970
       
    69 That we'll just save a
       
    70 lot of time for me to try
       
    71 
       
    72 16
       
    73 00:00:58,970 --> 00:01:02,030
       
    74 to figure out who
       
    75 has submitted what.
       
    76 
       
    77 17
       
    78 00:01:02,030 --> 00:01:04,445
       
    79 Please do that.
       
    80 
       
    81 18
       
    82 00:01:04,445 --> 00:01:07,789
       
    83 So what is the task
       
    84 into coursework?
       
    85 
       
    86 19
       
    87 00:01:07,789 --> 00:01:09,885
       
    88 I essentially showed you how
       
    89 
       
    90 20
       
    91 00:01:09,885 --> 00:01:11,995
       
    92 the Brzozowski matcher
       
    93 
       
    94 21
       
    95 00:01:11,995 --> 00:01:14,645
       
    96 works for the basic
       
    97 regular expressions.
       
    98 
       
    99 22
       
   100 00:01:14,645 --> 00:01:16,295
       
   101 I also showed you actually how it
       
   102 
       
   103 23
       
   104 00:01:16,295 --> 00:01:18,110
       
   105 works for the n-times.
       
   106 
       
   107 24
       
   108 00:01:18,110 --> 00:01:20,300
       
   109 And the task in coursework
       
   110 
       
   111 25
       
   112 00:01:20,300 --> 00:01:22,970
       
   113 is that you extend the
       
   114 Brzozowski matcher to
       
   115 
       
   116 26
       
   117 00:01:22,970 --> 00:01:25,820
       
   118 the other regular expressions
       
   119 
       
   120 27
       
   121 00:01:25,820 --> 00:01:27,800
       
   122 from the extended
       
   123 regular expressions.
       
   124 
       
   125 28
       
   126 00:01:27,800 --> 00:01:30,245
       
   127 So you're supposed
       
   128 to extended with
       
   129 
       
   130 29
       
   131 00:01:30,245 --> 00:01:34,490
       
   132 r-plus, optional r, for n-times.
       
   133 
       
   134 30
       
   135 00:01:34,490 --> 00:01:38,420
       
   136 You've already seen that.
       
   137 For 0 or more times,
       
   138 
       
   139 31
       
   140 00:01:38,420 --> 00:01:41,120
       
   141 but not more than m
       
   142 times regular expression.
       
   143 
       
   144 32
       
   145 00:01:41,120 --> 00:01:45,890
       
   146 For at least n-times regular
       
   147 expression and for between
       
   148 
       
   149 33
       
   150 00:01:45,890 --> 00:01:47,480
       
   151 n times and m times
       
   152 
       
   153 34
       
   154 00:01:47,480 --> 00:01:50,810
       
   155 regular expression and also this
       
   156 NOT regular expression.
       
   157 
       
   158 35
       
   159 00:01:50,810 --> 00:01:52,790
       
   160 So your task is to
       
   161 
       
   162 36
       
   163 00:01:52,790 --> 00:01:55,310
       
   164 essentially find the definition
       
   165 
       
   166 37
       
   167 00:01:55,310 --> 00:01:57,155
       
   168 for nullable in these cases.
       
   169 
       
   170 38
       
   171 00:01:57,155 --> 00:02:00,215
       
   172 Find a definition for derivative,
       
   173 
       
   174 39
       
   175 00:02:00,215 --> 00:02:02,480
       
   176 implement them,
       
   177 write them down in
       
   178 
       
   179 40
       
   180 00:02:02,480 --> 00:02:06,065
       
   181 a PDF and then do some
       
   182 experiments with them.
       
   183 
       
   184 41
       
   185 00:02:06,065 --> 00:02:08,875
       
   186 Well, how can you do that?
       
   187 
       
   188 42
       
   189 00:02:08,875 --> 00:02:10,370
       
   190 Well I've given you
       
   191 
       
   192 43
       
   193 00:02:10,370 --> 00:02:13,565
       
   194 the meaning of these additional
       
   195 regular expressions.
       
   196 
       
   197 44
       
   198 00:02:13,565 --> 00:02:16,730
       
   199 So here's, for example, the
       
   200 meaning of this r-plus.
       
   201 
       
   202 45
       
   203 00:02:16,730 --> 00:02:18,290
       
   204 So that would be, I have
       
   205 
       
   206 46
       
   207 00:02:18,290 --> 00:02:22,115
       
   208 at least one copy up to infinity.
       
   209 
       
   210 47
       
   211 00:02:22,115 --> 00:02:25,070
       
   212 And the optional-r would be it's
       
   213 
       
   214 48
       
   215 00:02:25,070 --> 00:02:28,370
       
   216 the language of r plus
       
   217 the empty string.
       
   218 
       
   219 49
       
   220 00:02:28,370 --> 00:02:30,440
       
   221 If I have it exactly n times,
       
   222 
       
   223 50
       
   224 00:02:30,440 --> 00:02:31,985
       
   225 then it would be
       
   226 just the language
       
   227 
       
   228 51
       
   229 00:02:31,985 --> 00:02:34,010
       
   230 of r exactly n-times.
       
   231 
       
   232 52
       
   233 00:02:34,010 --> 00:02:38,105
       
   234 And here you have union
       
   235 from 0 to m and so on.
       
   236 
       
   237 53
       
   238 00:02:38,105 --> 00:02:42,560
       
   239 This might be a slightly
       
   240 interesting regular expression.
       
   241 
       
   242 54
       
   243 00:02:42,560 --> 00:02:46,580
       
   244 So that's supposed to be the
       
   245 character set that is very
       
   246 
       
   247 55
       
   248 00:02:46,580 --> 00:02:48,410
       
   249 similar to character ranges like
       
   250 
       
   251 56
       
   252 00:02:48,410 --> 00:02:51,005
       
   253 in the existing regular
       
   254 expression matchers.
       
   255 
       
   256 57
       
   257 00:02:51,005 --> 00:02:52,820
       
   258 So this just says
       
   259 
       
   260 58
       
   261 00:02:52,820 --> 00:02:55,565
       
   262 ...this regular
       
   263 expression can match
       
   264 
       
   265 59
       
   266 00:02:55,565 --> 00:03:00,860
       
   267 either the character c1 or
       
   268 the character c2 up to cn.
       
   269 
       
   270 60
       
   271 00:03:00,860 --> 00:03:03,620
       
   272 Why do I include
       
   273 that regular expression?
       
   274 
       
   275 61
       
   276 00:03:03,620 --> 00:03:09,050
       
   277 I could also just say
       
   278 c1 plus c2 plus c3...
       
   279 
       
   280 62
       
   281 00:03:09,050 --> 00:03:12,889
       
   282 a big alternative.
       
   283 Well that's possible.
       
   284 
       
   285 63
       
   286 00:03:12,889 --> 00:03:16,595
       
   287 But remember the Achilles'
       
   288 heel of this algorithm is,
       
   289 
       
   290 64
       
   291 00:03:16,595 --> 00:03:18,425
       
   292 if the regular expression is large,
       
   293 
       
   294 65
       
   295 00:03:18,425 --> 00:03:20,825
       
   296 then the computation
       
   297 take always a long time.
       
   298 
       
   299 66
       
   300 00:03:20,825 --> 00:03:23,840
       
   301 So I'm trying to compress
       
   302 that as much as I can.
       
   303 
       
   304 67
       
   305 00:03:23,840 --> 00:03:25,370
       
   306 So instead of giving a big
       
   307 
       
   308 68
       
   309 00:03:25,370 --> 00:03:29,134
       
   310 alternative, I am trying to
       
   311 give one regular expression,
       
   312 
       
   313 69
       
   314 00:03:29,134 --> 00:03:31,340
       
   315 which can not just match
       
   316 a single character,
       
   317 
       
   318 70
       
   319 00:03:31,340 --> 00:03:34,230
       
   320 but a set of characters.
       
   321 
       
   322 71
       
   323 00:03:34,630 --> 00:03:36,980
       
   324 How can you be sure that
       
   325 
       
   326 72
       
   327 00:03:36,980 --> 00:03:39,215
       
   328 definition you come
       
   329 up with are correct?
       
   330 
       
   331 73
       
   332 00:03:39,215 --> 00:03:41,450
       
   333 Well, I showed you which are
       
   334 
       
   335 74
       
   336 00:03:41,450 --> 00:03:45,170
       
   337 the properties these two
       
   338 functions need to satisfy.
       
   339 
       
   340 75
       
   341 00:03:45,170 --> 00:03:47,060
       
   342 I've given you here what
       
   343 
       
   344 76
       
   345 00:03:47,060 --> 00:03:49,325
       
   346 the meaning of these
       
   347 expressions are.
       
   348 
       
   349 77
       
   350 00:03:49,325 --> 00:03:52,700
       
   351 So you will always know what's
       
   352 on the right-hand side.
       
   353 
       
   354 78
       
   355 00:03:52,700 --> 00:03:54,515
       
   356 This is completely determined.
       
   357 
       
   358 79
       
   359 00:03:54,515 --> 00:03:56,720
       
   360 You essentially
       
   361 have to fill something
       
   362 
       
   363 80
       
   364 00:03:56,720 --> 00:03:58,910
       
   365 equivalent on the left-hand side.
       
   366 
       
   367 81
       
   368 00:03:58,910 --> 00:04:02,105
       
   369 That's the main task
       
   370 of the coursework.
       
   371 
       
   372 82
       
   373 00:04:02,105 --> 00:04:08,090
       
   374 And you can start from the
       
   375 files I provided on KEATS.
       
   376 
       
   377 83
       
   378 00:04:08,090 --> 00:04:12,125
       
   379 So you would just have to
       
   380 add these additional cases.
       
   381 
       
   382 84
       
   383 00:04:12,125 --> 00:04:15,020
       
   384 When you add these
       
   385 additional cases
       
   386 
       
   387 85
       
   388 00:04:15,020 --> 00:04:17,330
       
   389 and you're using the Scala language,
       
   390 
       
   391 86
       
   392 00:04:17,330 --> 00:04:18,980
       
   393 you can do me a favour.
       
   394 
       
   395 87
       
   396 00:04:18,980 --> 00:04:22,550
       
   397 You can call these
       
   398 constructors for
       
   399 
       
   400 88
       
   401 00:04:22,550 --> 00:04:25,400
       
   402 these different regular
       
   403 expressions as RANGE,
       
   404 
       
   405 89
       
   406 00:04:25,400 --> 00:04:28,985
       
   407 PLUS, OPTIONAL and NTIMES,
       
   408 UPTO, FROM and BETWEEN.
       
   409 
       
   410 90
       
   411 00:04:28,985 --> 00:04:31,370
       
   412 Remember I have this
       
   413 convention to always
       
   414 
       
   415 91
       
   416 00:04:31,370 --> 00:04:34,025
       
   417 use capital letters
       
   418 for regular expressions.
       
   419 
       
   420 92
       
   421 00:04:34,025 --> 00:04:36,680
       
   422 It would be nice if you could use
       
   423 
       
   424 93
       
   425 00:04:36,680 --> 00:04:39,260
       
   426 these names because
       
   427 then it will be
       
   428 
       
   429 94
       
   430 00:04:39,260 --> 00:04:42,335
       
   431 very easy for me
       
   432 to test your code.
       
   433 
       
   434 95
       
   435 00:04:42,335 --> 00:04:44,690
       
   436 If you're using a different
       
   437 programming language
       
   438 
       
   439 96
       
   440 00:04:44,690 --> 00:04:46,370
       
   441 like let's say Rust,
       
   442 
       
   443 97
       
   444 00:04:46,370 --> 00:04:48,860
       
   445 I expect some people will use that, where I
       
   446 
       
   447 98
       
   448 00:04:48,860 --> 00:04:51,380
       
   449 have no idea what are the
       
   450 conventions in your language,
       
   451 
       
   452 99
       
   453 00:04:51,380 --> 00:04:53,420
       
   454 how you have to call
       
   455 these constructors,
       
   456 
       
   457 100
       
   458 00:04:53,420 --> 00:04:56,420
       
   459 we will see whatever you
       
   460 submit. I'll have a look.
       
   461 
       
   462 101
       
   463 00:04:56,420 --> 00:04:59,360
       
   464 There's one more
       
   465 constraint I have to
       
   466 
       
   467 102
       
   468 00:04:59,360 --> 00:05:02,194
       
   469 impose to make this
       
   470 coursework interesting.
       
   471 
       
   472 103
       
   473 00:05:02,194 --> 00:05:04,730
       
   474 I do not want you 
       
   475 that you just take
       
   476 
       
   477 104
       
   478 00:05:04,730 --> 00:05:07,370
       
   479 these extended regular
       
   480 expressions and that you
       
   481 
       
   482 105
       
   483 00:05:07,370 --> 00:05:10,550
       
   484 expand them using
       
   485 their definition.
       
   486 
       
   487 106
       
   488 00:05:10,550 --> 00:05:12,320
       
   489 Because that would make the regular
       
   490 
       
   491 107
       
   492 00:05:12,320 --> 00:05:13,955
       
   493 expression matcher very slow.
       
   494 
       
   495 108
       
   496 00:05:13,955 --> 00:05:16,160
       
   497 So for example, if
       
   498 you want to define
       
   499 
       
   500 109
       
   501 00:05:16,160 --> 00:05:18,785
       
   502 what's the derivative for r-plus,
       
   503 
       
   504 110
       
   505 00:05:18,785 --> 00:05:21,560
       
   506 then what does not
       
   507 count as a solution:
       
   508 
       
   509 111
       
   510 00:05:21,560 --> 00:05:24,770
       
   511 if you just expand that
       
   512 to the definition that r
       
   513 
       
   514 112
       
   515 00:05:24,770 --> 00:05:28,935
       
   516 plus is equivalent to
       
   517 r followed by r-star.
       
   518 
       
   519 113
       
   520 00:05:28,935 --> 00:05:32,845
       
   521 So in code, what I
       
   522 would like to not see,
       
   523 
       
   524 114
       
   525 00:05:32,845 --> 00:05:35,680
       
   526 I would not give any
       
   527 marks for that is,
       
   528 
       
   529 115
       
   530 00:05:35,680 --> 00:05:37,780
       
   531 if you add the plus as follows,
       
   532 
       
   533 116
       
   534 00:05:37,780 --> 00:05:39,910
       
   535 so that is still perfectly fine.
       
   536 
       
   537 117
       
   538 00:05:39,910 --> 00:05:42,160
       
   539 There's nothing you
       
   540 can do differently.
       
   541 
       
   542 118
       
   543 00:05:42,160 --> 00:05:44,065
       
   544 That would be the constructor.
       
   545 
       
   546 119
       
   547 00:05:44,065 --> 00:05:46,480
       
   548 But when you define nullable,
       
   549 
       
   550 120
       
   551 00:05:46,480 --> 00:05:49,180
       
   552 I do not want to see
       
   553 that you defined
       
   554 
       
   555 121
       
   556 00:05:49,180 --> 00:05:51,790
       
   557 this plus r as nullable
       
   558 
       
   559 122
       
   560 00:05:51,790 --> 00:05:54,385
       
   561 of the sequence of r and star-r,
       
   562 
       
   563 123
       
   564 00:05:54,385 --> 00:05:58,075
       
   565 just unfolding
       
   566 the definition.
       
   567 
       
   568 124
       
   569 00:05:58,075 --> 00:06:00,415
       
   570 As you can see, when you come
       
   571 
       
   572 125
       
   573 00:06:00,415 --> 00:06:02,815
       
   574 up with a much better
       
   575 solution for that.
       
   576 
       
   577 126
       
   578 00:06:02,815 --> 00:06:05,110
       
   579 This is a very inefficient way
       
   580 
       
   581 127
       
   582 00:06:05,110 --> 00:06:07,090
       
   583 for how to calculate nullable.
       
   584 
       
   585 128
       
   586 00:06:07,090 --> 00:06:10,825
       
   587 And so the same for derivative
       
   588 would not be allowed.
       
   589 
       
   590 129
       
   591 00:06:10,825 --> 00:06:13,895
       
   592 If you, I have the plus r here,
       
   593 
       
   594 130
       
   595 00:06:13,895 --> 00:06:16,685
       
   596 you can't just unfold
       
   597 the definition,
       
   598 
       
   599 131
       
   600 00:06:16,685 --> 00:06:19,790
       
   601 with r-plus
       
   602 being defined as r
       
   603 
       
   604 132
       
   605 00:06:19,790 --> 00:06:21,350
       
   606 followed by r star and
       
   607 
       
   608 133
       
   609 00:06:21,350 --> 00:06:23,375
       
   610 then just build the
       
   611 derivative of that.
       
   612 
       
   613 134
       
   614 00:06:23,375 --> 00:06:25,280
       
   615 You have to find something more
       
   616 
       
   617 135
       
   618 00:06:25,280 --> 00:06:28,730
       
   619 primitive that involves
       
   620 only the derivative of r,
       
   621 
       
   622 136
       
   623 00:06:28,730 --> 00:06:31,790
       
   624 essentially, nothing
       
   625 more involved. The same
       
   626 
       
   627 137
       
   628 00:06:31,790 --> 00:06:33,815
       
   629 if you have n-times, for example,
       
   630 
       
   631 138
       
   632 00:06:33,815 --> 00:06:39,935
       
   633 you can't just unfold this
       
   634 n-times in this sequence
       
   635 
       
   636 139
       
   637 00:06:39,935 --> 00:06:43,310
       
   638 of .... n-copies
       
   639 
       
   640 140
       
   641 00:06:43,310 --> 00:06:47,285
       
   642 of this r. You have to get
       
   643 something more primitive.
       
   644 
       
   645 141
       
   646 00:06:47,285 --> 00:06:49,760
       
   647 Otherwise, as you've
       
   648 seen earlier,
       
   649 
       
   650 142
       
   651 00:06:49,760 --> 00:06:53,135
       
   652 your regular expression matcher
       
   653 would be totally slow.
       
   654 
       
   655 143
       
   656 00:06:53,135 --> 00:06:55,475
       
   657 When you submit your solution,
       
   658 
       
   659 144
       
   660 00:06:55,475 --> 00:06:58,445
       
   661 please submit this
       
   662 solution in the PDF.
       
   663 
       
   664 145
       
   665 00:06:58,445 --> 00:07:01,580
       
   666 So give the cases of your definition
       
   667 
       
   668 146
       
   669 00:07:01,580 --> 00:07:06,004
       
   670 in a form like that for
       
   671 nullable and derivative.
       
   672 
       
   673 147
       
   674 00:07:06,004 --> 00:07:08,405
       
   675 And also implement it in code.
       
   676 
       
   677 148
       
   678 00:07:08,405 --> 00:07:10,910
       
   679 That is just helping me to
       
   680 
       
   681 149
       
   682 00:07:10,910 --> 00:07:13,400
       
   683 find out whether you have
       
   684 the correct solution or not.
       
   685 
       
   686 150
       
   687 00:07:13,400 --> 00:07:15,710
       
   688 So you needed a kind of
       
   689 mathematical notation of
       
   690 
       
   691 151
       
   692 00:07:15,710 --> 00:07:18,695
       
   693 your solution, and
       
   694 an actual code.
       
   695 
       
   696 152
       
   697 00:07:18,695 --> 00:07:22,414
       
   698 And then once you
       
   699 have your definition,
       
   700 
       
   701 153
       
   702 00:07:22,414 --> 00:07:25,699
       
   703 also make sure you try
       
   704 it out on some examples.
       
   705 
       
   706 154
       
   707 00:07:25,699 --> 00:07:28,970
       
   708 These will be the examples
       
   709 I will definitely try out,
       
   710 
       
   711 155
       
   712 00:07:28,970 --> 00:07:30,560
       
   713 but probably also more
       
   714 
       
   715 156
       
   716 00:07:30,560 --> 00:07:33,215
       
   717 depending on what
       
   718 definitions you give me.
       
   719 
       
   720 157
       
   721 00:07:33,215 --> 00:07:38,300
       
   722 There's one more question I
       
   723 ask you to do. You've seen
       
   724 
       
   725 158
       
   726 00:07:38,300 --> 00:07:40,130
       
   727 there's some regular
       
   728 expressions that
       
   729 
       
   730 159
       
   731 00:07:40,130 --> 00:07:42,215
       
   732 are involved for characters,
       
   733 
       
   734 160
       
   735 00:07:42,215 --> 00:07:46,925
       
   736 for character ranges or
       
   737 character sets essentially.
       
   738 
       
   739 161
       
   740 00:07:46,925 --> 00:07:48,665
       
   741 And sometimes I also want to have
       
   742 
       
   743 162
       
   744 00:07:48,665 --> 00:07:51,665
       
   745 just a reg expression which
       
   746 can match any character!!
       
   747 
       
   748 163
       
   749 00:07:51,665 --> 00:07:56,195
       
   750 And I could have for all of
       
   751 them separate definitions.
       
   752 
       
   753 164
       
   754 00:07:56,195 --> 00:07:57,710
       
   755 And that would mean I also need
       
   756 
       
   757 165
       
   758 00:07:57,710 --> 00:07:59,645
       
   759 separate definitions for nullable,
       
   760 
       
   761 166
       
   762 00:07:59,645 --> 00:08:02,405
       
   763 separate definitions
       
   764 for derivative.
       
   765 
       
   766 167
       
   767 00:08:02,405 --> 00:08:05,825
       
   768 But actually they can be
       
   769 generalized to just one constructor.
       
   770 
       
   771 168
       
   772 00:08:05,825 --> 00:08:08,210
       
   773 And that is if we introduce
       
   774 
       
   775 169
       
   776 00:08:08,210 --> 00:08:11,630
       
   777 a constructor for regular expressions,
       
   778 
       
   779 170
       
   780 00:08:11,630 --> 00:08:13,760
       
   781 which not just takes
       
   782 a single character
       
   783 
       
   784 171
       
   785 00:08:13,760 --> 00:08:15,469
       
   786 or set of characters.
       
   787 
       
   788 172
       
   789 00:08:15,469 --> 00:08:18,245
       
   790 But, which I call here CFUN.
       
   791 
       
   792 173
       
   793 00:08:18,245 --> 00:08:23,165
       
   794 And it takes a function from
       
   795 characters to booleans.
       
   796 
       
   797 174
       
   798 00:08:23,165 --> 00:08:24,470
       
   799 So if I want to match
       
   800 
       
   801 175
       
   802 00:08:24,470 --> 00:08:26,900
       
   803 just a single character
       
   804 then this function f
       
   805 
       
   806 176
       
   807 00:08:26,900 --> 00:08:29,060
       
   808 would only say true
       
   809 
       
   810 177
       
   811 00:08:29,060 --> 00:08:32,225
       
   812 if it gets as argument
       
   813 the single character.
       
   814 
       
   815 178
       
   816 00:08:32,225 --> 00:08:34,850
       
   817 If I have a character set,
       
   818 
       
   819 179
       
   820 00:08:34,850 --> 00:08:36,515
       
   821 then I would say, well,
       
   822 
       
   823 180
       
   824 00:08:36,515 --> 00:08:38,300
       
   825 I need a function
       
   826 which says true
       
   827 
       
   828 181
       
   829 00:08:38,300 --> 00:08:40,970
       
   830 for all the characters
       
   831 in this set.
       
   832 
       
   833 182
       
   834 00:08:40,970 --> 00:08:43,460
       
   835 And otherwise it's false.
       
   836 
       
   837 183
       
   838 00:08:43,460 --> 00:08:46,205
       
   839 And if I want to
       
   840 match any character!!,
       
   841 
       
   842 184
       
   843 00:08:46,205 --> 00:08:48,500
       
   844 then they should get a function
       
   845 
       
   846 185
       
   847 00:08:48,500 --> 00:08:53,450
       
   848 which says true for
       
   849 all characters.
       
   850 
       
   851 186
       
   852 00:08:53,450 --> 00:08:56,630
       
   853 Okay? If I have
       
   854 such a constructor
       
   855 
       
   856 187
       
   857 00:08:56,630 --> 00:09:00,140
       
   858 that actually I can save
       
   859 myself a bit of work.
       
   860 
       
   861 188
       
   862 00:09:00,140 --> 00:09:03,200
       
   863 And I can just have one case
       
   864 
       
   865 189
       
   866 00:09:03,200 --> 00:09:06,920
       
   867 for nullable and one
       
   868 case for CFUNS.
       
   869 
       
   870 190
       
   871 00:09:06,920 --> 00:09:09,800
       
   872 So that would then subsume
       
   873 the character ranges and
       
   874 
       
   875 191
       
   876 00:09:09,800 --> 00:09:13,385
       
   877 the character and also
       
   878 this ALL regular expression.
       
   879 
       
   880 192
       
   881 00:09:13,385 --> 00:09:15,380
       
   882 Ok, this was not explicitly
       
   883 included at the beginning,
       
   884 
       
   885 193
       
   886 00:09:15,380 --> 00:09:17,510
       
   887 but that's what I can now define.
       
   888 
       
   889 194
       
   890 00:09:17,510 --> 00:09:21,740
       
   891 And then I can define
       
   892 this for characters,
       
   893 
       
   894 195
       
   895 00:09:21,740 --> 00:09:23,885
       
   896 just as special cases
       
   897 for these functions.
       
   898 
       
   899 196
       
   900 00:09:23,885 --> 00:09:25,610
       
   901 And I told you already
       
   902 what this function
       
   903 
       
   904 197
       
   905 00:09:25,610 --> 00:09:28,025
       
   906 should look like in
       
   907 these three cases.
       
   908 
       
   909 198
       
   910 00:09:28,025 --> 00:09:30,350
       
   911 So I let ...
       
   912 
       
   913 199
       
   914 00:09:30,350 --> 00:09:33,515
       
   915 you decide how you're
       
   916 going to implement that.
       
   917 
       
   918 200
       
   919 00:09:33,515 --> 00:09:35,450
       
   920 If you first define
       
   921 
       
   922 201
       
   923 00:09:35,450 --> 00:09:38,495
       
   924 your implementation is
       
   925 all these explicit cases.
       
   926 
       
   927 202
       
   928 00:09:38,495 --> 00:09:41,900
       
   929 Because in these cases it's
       
   930 actually more intuitive,
       
   931 
       
   932 203
       
   933 00:09:41,900 --> 00:09:45,980
       
   934 what nullable and
       
   935 derivative should be.
       
   936 
       
   937 204
       
   938 00:09:45,980 --> 00:09:48,035
       
   939 And then in a second step,
       
   940 
       
   941 205
       
   942 00:09:48,035 --> 00:09:51,140
       
   943 you implement these
       
   944 more general cases.
       
   945 
       
   946 206
       
   947 00:09:51,140 --> 00:09:53,360
       
   948 And just keep the original ones
       
   949 
       
   950 207
       
   951 00:09:53,360 --> 00:09:54,500
       
   952 in your implementation if you
       
   953 
       
   954 208
       
   955 00:09:54,500 --> 00:09:57,710
       
   956 want, or if you feel
       
   957 adventurous enough,
       
   958 
       
   959 209
       
   960 00:09:57,710 --> 00:09:59,225
       
   961 and want to be lazy,
       
   962 
       
   963 210
       
   964 00:09:59,225 --> 00:10:01,115
       
   965 that you just implement that.
       
   966 
       
   967 211
       
   968 00:10:01,115 --> 00:10:02,660
       
   969 And then you have already done
       
   970 
       
   971 212
       
   972 00:10:02,660 --> 00:10:06,530
       
   973 at least two constructors
       
   974 by just implementing one.
       
   975 
       
   976 213
       
   977 00:10:06,530 --> 00:10:08,915
       
   978 But as said that 
       
   979 requires a bit skill
       
   980 
       
   981 214
       
   982 00:10:08,915 --> 00:10:11,450
       
   983 of generalizing how
       
   984 that should be.
       
   985 
       
   986 215
       
   987 00:10:11,450 --> 00:10:14,180
       
   988 And the other questions
       
   989 are just then
       
   990 
       
   991 216
       
   992 00:10:14,180 --> 00:10:16,745
       
   993 trying out your
       
   994 expression matcher on
       
   995 
       
   996 217
       
   997 00:10:16,745 --> 00:10:19,580
       
   998 some examples, more
       
   999 power examples,
       
  1000 
       
  1001 218
       
  1002 00:10:19,580 --> 00:10:22,400
       
  1003 and they should be
       
  1004 very easy to solve.
       
  1005 
       
  1006 219
       
  1007 00:10:22,400 --> 00:10:24,605
       
  1008 So I hope it's not
       
  1009 too much asked for
       
  1010 
       
  1011 220
       
  1012 00:10:24,605 --> 00:10:26,615
       
  1013 and I hope you have fun.
       
  1014 
       
  1015 221
       
  1016 00:10:26,615 --> 00:10:29,675
       
  1017 It is not too much ask
       
  1018 because you can,
       
  1019 
       
  1020 222
       
  1021 00:10:29,675 --> 00:10:32,420
       
  1022 I hope it's not too much
       
  1023 because you can start from
       
  1024 
       
  1025 223
       
  1026 00:10:32,420 --> 00:10:35,615
       
  1027 my definitions or
       
  1028 my implementation.
       
  1029 
       
  1030 224
       
  1031 00:10:35,615 --> 00:10:39,425
       
  1032 It's really essentially
       
  1033 mostly is brain work,
       
  1034 
       
  1035 225
       
  1036 00:10:39,425 --> 00:10:42,170
       
  1037 how these nullable and
       
  1038 derivative should work.
       
  1039 
       
  1040 226
       
  1041 00:10:42,170 --> 00:10:44,510
       
  1042 If you're in a
       
  1043 language like Scala,
       
  1044 
       
  1045 227
       
  1046 00:10:44,510 --> 00:10:48,875
       
  1047 the implementation should
       
  1048 then be easy-peasy.
       
  1049 
       
  1050 228
       
  1051 00:10:48,875 --> 00:10:51,365
       
  1052 If you are in a different language
       
  1053 
       
  1054 229
       
  1055 00:10:51,365 --> 00:10:52,610
       
  1056 I assume you also
       
  1057 
       
  1058 230
       
  1059 00:10:52,610 --> 00:10:54,890
       
  1060 dedicated to that
       
  1061 language to start with,
       
  1062 
       
  1063 231
       
  1064 00:10:54,890 --> 00:10:58,475
       
  1065 so it should be also very
       
  1066 easy for you to translate
       
  1067 
       
  1068 232
       
  1069 00:10:58,475 --> 00:11:01,100
       
  1070 my Scala code into whatever
       
  1071 language you want to
       
  1072 
       
  1073 233
       
  1074 00:11:01,100 --> 00:11:04,280
       
  1075 do, first and then
       
  1076 start from there.
       
  1077 
       
  1078 234
       
  1079 00:11:04,280 --> 00:11:06,230
       
  1080 If there are any questions,
       
  1081 
       
  1082 235
       
  1083 00:11:06,230 --> 00:11:08,360
       
  1084 please ask me or the TAs.
       
  1085 
       
  1086 236
       
  1087 00:11:08,360 --> 00:11:10,040
       
  1088 We are trying to be as helpful
       
  1089 
       
  1090 237
       
  1091 00:11:10,040 --> 00:11:12,900
       
  1092 as possible with the coursework.
       
  1093 
       
  1094 238
       
  1095 00:11:13,240 --> 00:11:17,885
       
  1096 Bear in mind, this is the
       
  1097 first step in our compiler.
       
  1098 
       
  1099 239
       
  1100 00:11:17,885 --> 00:11:21,005
       
  1101 The coursework 2 will then
       
  1102 build on top of that.
       
  1103 
       
  1104 240
       
  1105 00:11:21,005 --> 00:11:25,770
       
  1106 So it's better to get this
       
  1107 correct first. Thanks.