updated
authorChristian Urban <christian.urban@kcl.ac.uk>
Tue, 06 Oct 2020 13:32:00 +0100
changeset 776 939c10745a3a
parent 775 16aafc7691d8
child 777 a10430cb797c
updated
hws/hw01.pdf
hws/hw01.tex
slides/slides03.pdf
slides/slides03.tex
videos/02-proof.srt
Binary file hws/hw01.pdf has changed
--- a/hws/hw01.tex	Tue Oct 06 00:39:34 2020 +0100
+++ b/hws/hw01.tex	Tue Oct 06 13:32:00 2020 +0100
@@ -72,7 +72,7 @@
       \{[a], [b], [c], []\}$.
 
 \item (1) How many basic regular expressions are there to match
-      the string $abcd$? (2) How many if they cannot include
+      \textbf{only} the string $abcd$? (2) How many if they cannot include
       $\ONE$ and $\ZERO$? (3) How many if they are also not
       allowed to contain stars? (4) How many if they are also
       not allowed to contain $\_ + \_$?
Binary file slides/slides03.pdf has changed
--- a/slides/slides03.tex	Tue Oct 06 00:39:34 2020 +0100
+++ b/slides/slides03.tex	Tue Oct 06 13:32:00 2020 +0100
@@ -55,17 +55,17 @@
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%     
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\begin{frame}[c]
-\frametitle{Scala Book, Exams}
+%\begin{frame}[c]
+%\frametitle{Scala Book, Exams}
+%
+%\begin{itemize}
+%\item \small https://nms.kcl.ac.uk/christian.urban/ProgInScala2ed.pdf\normalsize
+%\item homework (written exam 80\%)
+%\item coursework (20\%)\bigskip
+%\item short survey at KEATS; to be answered until Sunday    
+%\end{itemize}
 
-\begin{itemize}
-\item \small https://nms.kcl.ac.uk/christian.urban/ProgInScala2ed.pdf\normalsize
-\item homework (written exam 80\%)
-\item coursework (20\%)\bigskip
-\item short survey at KEATS; to be answered until Sunday    
-\end{itemize}
-
-\end{frame}
+%\end{frame}
 %%%%%%%%%%%
 
 
@@ -92,7 +92,7 @@
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%   
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\begin{frame}[c]
+\begin{frame}[t]
 \frametitle{Negation}
 
 Assume you have an alphabet consisting of the letters \bl{$a$}, \bl{$b$} and \bl{$c$} only.
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/videos/02-proof.srt	Tue Oct 06 13:32:00 2020 +0100
@@ -0,0 +1,2110 @@
+1
+00:00:06,260 --> 00:00:10,275
+I come back, I know
+the topic of proofs
+
+2
+00:00:10,275 --> 00:00:13,950
+is usually not the most
+favorite topics with students.
+
+3
+00:00:13,950 --> 00:00:17,220
+But I'm really passionate
+about proofs in
+
+4
+00:00:17,220 --> 00:00:19,320
+this module because they really
+
+5
+00:00:19,320 --> 00:00:21,885
+help you with understanding
+what's going on.
+
+6
+00:00:21,885 --> 00:00:25,170
+And very often they help
+you with preventing errors.
+
+7
+00:00:25,170 --> 00:00:28,695
+You seen me earlier doing
+these calculations.
+
+8
+00:00:28,695 --> 00:00:30,210
+And if I hadn't had
+
+9
+00:00:30,210 --> 00:00:33,330
+the safety net of
+knowing what I'm doing,
+
+10
+00:00:33,330 --> 00:00:35,130
+I could have easily gone wrong.
+
+11
+00:00:35,130 --> 00:00:36,390
+And that's actually a problem is
+
+12
+00:00:36,390 --> 00:00:38,935
+existing wreck Expression
+Matching engines.
+
+13
+00:00:38,935 --> 00:00:41,330
+Sometimes in corner
+cases as we seen,
+
+14
+00:00:41,330 --> 00:00:43,130
+they produce slow results,
+
+15
+00:00:43,130 --> 00:00:45,455
+which is actually not so bad.
+
+16
+00:00:45,455 --> 00:00:48,020
+Much worse is that
+we also know they
+
+17
+00:00:48,020 --> 00:00:50,570
+sometimes produce
+incorrect results.
+
+18
+00:00:50,570 --> 00:00:51,980
+So for us here,
+
+19
+00:00:51,980 --> 00:00:53,390
+the proofs will be
+
+20
+00:00:53,390 --> 00:00:55,100
+essentially the safety nets
+
+21
+00:00:55,100 --> 00:00:58,295
+of making sure that our
+algorithms actually correct.
+
+22
+00:00:58,295 --> 00:01:00,095
+And that's actually
+the beauty of that.
+
+23
+00:01:00,095 --> 00:01:03,170
+It doesn't happen very often
+in real life that we can
+
+24
+00:01:03,170 --> 00:01:06,500
+actually prove the
+correctness of our algorithm.
+
+25
+00:01:06,500 --> 00:01:08,405
+But we can do this here.
+
+26
+00:01:08,405 --> 00:01:11,000
+So we will start
+a bit slow first,
+
+27
+00:01:11,000 --> 00:01:14,330
+I will show you how
+properties are stated.
+
+28
+00:01:14,330 --> 00:01:17,255
+And then along the way
+we will see how proofs
+
+29
+00:01:17,255 --> 00:01:20,940
+actually can be performed
+and how they can help you.
+
+30
+00:01:20,980 --> 00:01:25,205
+You can probably remember this
+slide from the beginning.
+
+31
+00:01:25,205 --> 00:01:29,300
+Now we would probably stayed
+the same property that
+
+32
+00:01:29,300 --> 00:01:32,420
+our algorithm matches
+the specification
+
+33
+00:01:32,420 --> 00:01:33,695
+of what that means.
+
+34
+00:01:33,695 --> 00:01:35,450
+Maybe slightly more mathematical,
+
+35
+00:01:35,450 --> 00:01:37,130
+we would write
+something like this.
+
+36
+00:01:37,130 --> 00:01:39,545
+If our Marcia is given as
+
+37
+00:01:39,545 --> 00:01:42,740
+a string direct expression
+R and says yes,
+
+38
+00:01:42,740 --> 00:01:45,110
+then this string
+really needs to be in
+
+39
+00:01:45,110 --> 00:01:48,350
+the language of R. And
+if the matcher says no,
+
+40
+00:01:48,350 --> 00:01:50,120
+then this string cannot be in
+
+41
+00:01:50,120 --> 00:01:52,325
+the language of r. And we spent
+
+42
+00:01:52,325 --> 00:01:54,590
+quite some effort to make precise
+
+43
+00:01:54,590 --> 00:01:57,065
+what the language is
+of a wreck expression.
+
+44
+00:01:57,065 --> 00:01:59,090
+And we spend some effort to
+
+45
+00:01:59,090 --> 00:02:01,505
+make precise what
+our algorithm is.
+
+46
+00:02:01,505 --> 00:02:02,840
+So we could now take
+
+47
+00:02:02,840 --> 00:02:06,875
+this property and we could
+generate some test cases,
+
+48
+00:02:06,875 --> 00:02:09,410
+and that is all fine and good.
+
+49
+00:02:09,410 --> 00:02:13,340
+Testing is a very important
+thing for our software.
+
+50
+00:02:13,340 --> 00:02:16,715
+But we can only do
+this testing so far.
+
+51
+00:02:16,715 --> 00:02:19,385
+Remember in the homework,
+
+52
+00:02:19,385 --> 00:02:21,275
+there was a single string,
+
+53
+00:02:21,275 --> 00:02:23,750
+abcd, and there were
+actually infinitely
+
+54
+00:02:23,750 --> 00:02:27,530
+many wreck expressions which
+can match only that string.
+
+55
+00:02:27,530 --> 00:02:30,470
+So testing something for
+infinitely many things,
+
+56
+00:02:30,470 --> 00:02:32,000
+that's a bit hard.
+
+57
+00:02:32,000 --> 00:02:36,710
+And Summa, as soon as we have
+an expression with a star,
+
+58
+00:02:36,710 --> 00:02:39,289
+this l function returns,
+
+59
+00:02:39,289 --> 00:02:41,360
+in the general case
+an infinite set.
+
+60
+00:02:41,360 --> 00:02:43,940
+And testing whether a
+string is an infinite set,
+
+61
+00:02:43,940 --> 00:02:45,500
+that's also a bit hard.
+
+62
+00:02:45,500 --> 00:02:47,510
+So testing can only do so much.
+
+63
+00:02:47,510 --> 00:02:49,820
+We could generate some
+wreck expressions
+
+64
+00:02:49,820 --> 00:02:51,560
+and we can generate some strings.
+
+65
+00:02:51,560 --> 00:02:53,990
+And we can test
+whether that is true.
+
+66
+00:02:53,990 --> 00:02:57,470
+So B would be still
+in the quandary that
+
+67
+00:02:57,470 --> 00:03:00,950
+maybe there's a corner
+case which doesn't work.
+
+68
+00:03:00,950 --> 00:03:03,110
+So what we want to
+achieve instead
+
+69
+00:03:03,110 --> 00:03:05,615
+is you want to really make sure
+
+70
+00:03:05,615 --> 00:03:07,325
+that this algorithm works for
+
+71
+00:03:07,325 --> 00:03:10,850
+all reg expression
+and for all strings.
+
+72
+00:03:10,850 --> 00:03:16,400
+And that ONE proving will
+help us to establish.
+
+73
+00:03:16,400 --> 00:03:19,040
+Testing cannot do that.
+
+74
+00:03:19,040 --> 00:03:22,310
+Remember, metro
+mainly consisted of
+
+75
+00:03:22,310 --> 00:03:25,444
+these functions,
+nullable and derivative.
+
+76
+00:03:25,444 --> 00:03:27,755
+So if you want to prove
+anything about Metro.
+
+77
+00:03:27,755 --> 00:03:30,200
+Bet on, make sure that
+we know what actually
+
+78
+00:03:30,200 --> 00:03:33,890
+the specification of
+nullable and derivative is.
+
+79
+00:03:33,890 --> 00:03:35,750
+That's actually quite
+instructive to also
+
+80
+00:03:35,750 --> 00:03:37,850
+understand how they work.
+
+81
+00:03:37,850 --> 00:03:39,695
+So let's do that next.
+
+82
+00:03:39,695 --> 00:03:41,975
+The central property of nullable,
+
+83
+00:03:41,975 --> 00:03:43,445
+Actually that's not so hard.
+
+84
+00:03:43,445 --> 00:03:48,740
+It's just says a function or
+a reg expression is nullable
+
+85
+00:03:48,740 --> 00:03:50,360
+if and only if
+
+86
+00:03:50,360 --> 00:03:52,160
+the empty string is in
+
+87
+00:03:52,160 --> 00:03:54,410
+the language that was the
+purpose of this novel,
+
+88
+00:03:54,410 --> 00:03:55,670
+it takes work, expression and
+
+89
+00:03:55,670 --> 00:03:58,490
+test whether can match
+the empty string.
+
+90
+00:03:58,490 --> 00:04:01,010
+Meaning this empty string needs
+
+91
+00:04:01,010 --> 00:04:03,230
+to be in the language
+of R. And again,
+
+92
+00:04:03,230 --> 00:04:04,385
+we have this if an only,
+
+93
+00:04:04,385 --> 00:04:05,885
+if this one says yes,
+
+94
+00:04:05,885 --> 00:04:08,465
+then the empty string needs
+to be in this language.
+
+95
+00:04:08,465 --> 00:04:10,385
+And if this nullable says no,
+
+96
+00:04:10,385 --> 00:04:13,085
+the empty string CoMP
+unless language.
+
+97
+00:04:13,085 --> 00:04:15,290
+So that's relatively easy.
+
+98
+00:04:15,290 --> 00:04:18,110
+The next one might be also
+very instructive to look
+
+99
+00:04:18,110 --> 00:04:22,775
+at what's the property the
+derivative should satisfy.
+
+100
+00:04:22,775 --> 00:04:26,329
+And remember, the derivative
+was a kind of operation.
+
+101
+00:04:26,329 --> 00:04:28,850
+We taking a wrecker
+expression and we're looking
+
+102
+00:04:28,850 --> 00:04:31,955
+at all the strings this
+reg expression can match,
+
+103
+00:04:31,955 --> 00:04:35,510
+but which starts with
+C. And then me somehow
+
+104
+00:04:35,510 --> 00:04:37,250
+looking for reg expression
+which can match
+
+105
+00:04:37,250 --> 00:04:39,335
+the same string starting with C,
+
+106
+00:04:39,335 --> 00:04:42,440
+except that this
+c is chopped off.
+
+107
+00:04:42,440 --> 00:04:47,150
+And that actually saves
+what this property states.
+
+108
+00:04:47,150 --> 00:04:50,300
+Take the, take a
+break expression are,
+
+109
+00:04:50,300 --> 00:04:53,045
+and build the derivative
+according to see,
+
+110
+00:04:53,045 --> 00:04:56,150
+okay, so this gives
+us a reg expression,
+
+111
+00:04:56,150 --> 00:04:58,025
+the derivative of that
+
+112
+00:04:58,025 --> 00:05:01,775
+r. Now we take the
+a language of that,
+
+113
+00:05:01,775 --> 00:05:06,620
+are that supposed to be
+the same set as if I take
+
+114
+00:05:06,620 --> 00:05:12,845
+the language of R and then
+build the semantic derivative.
+
+115
+00:05:12,845 --> 00:05:16,055
+Remember the semantic derivative
+was doing exactly that.
+
+116
+00:05:16,055 --> 00:05:18,380
+It goes through all
+the strings in this
+
+117
+00:05:18,380 --> 00:05:21,350
+that filters out everything which
+
+118
+00:05:21,350 --> 00:05:24,770
+doesn't start with C.
+So throws them away of
+
+119
+00:05:24,770 --> 00:05:26,510
+the remaining strings
+which start with
+
+120
+00:05:26,510 --> 00:05:28,670
+C. It chops off so that C,
+
+121
+00:05:28,670 --> 00:05:31,370
+So this will be exactly
+
+122
+00:05:31,370 --> 00:05:34,205
+what our derivative
+is supposed to do.
+
+123
+00:05:34,205 --> 00:05:36,605
+And yes, if you can
+establish that,
+
+124
+00:05:36,605 --> 00:05:38,450
+then the language of
+
+125
+00:05:38,450 --> 00:05:43,355
+this derivative operation
+will produce the same sat.
+
+126
+00:05:43,355 --> 00:05:45,815
+This actually also
+how I'm going to
+
+127
+00:05:45,815 --> 00:05:49,760
+test in cases icons
+yet at the beginning,
+
+128
+00:05:49,760 --> 00:05:52,910
+how your coursework is correct.
+
+129
+00:05:52,910 --> 00:05:55,625
+If you give me a
+definition for derivative
+
+130
+00:05:55,625 --> 00:05:58,730
+of one of these bounded
+record expressions,
+
+131
+00:05:58,730 --> 00:06:01,085
+and I can't see
+whether it's correct.
+
+132
+00:06:01,085 --> 00:06:03,380
+Then what it will do
+is I will just try to
+
+133
+00:06:03,380 --> 00:06:05,675
+establish that this
+property holds.
+
+134
+00:06:05,675 --> 00:06:07,265
+And if I can establish that,
+
+135
+00:06:07,265 --> 00:06:08,870
+then you coursework is find.
+
+136
+00:06:08,870 --> 00:06:10,550
+And if I can't establish that,
+
+137
+00:06:10,550 --> 00:06:12,410
+then usually it shows
+
+138
+00:06:12,410 --> 00:06:14,360
+there is a counterexample
+somewhere.
+
+139
+00:06:14,360 --> 00:06:19,220
+Because this derivative,
+as you've seen desk times,
+
+140
+00:06:19,220 --> 00:06:21,725
+more than one solution
+you can give.
+
+141
+00:06:21,725 --> 00:06:23,915
+Some of them are more
+efficient than others.
+
+142
+00:06:23,915 --> 00:06:25,685
+But in terms of correctness,
+
+143
+00:06:25,685 --> 00:06:30,680
+there are multiple solutions
+for the coursework.
+
+144
+00:06:30,680 --> 00:06:34,145
+And in the past I really had
+to scratch my head and say,
+
+145
+00:06:34,145 --> 00:06:36,995
+Is that definition
+really correct?
+
+146
+00:06:36,995 --> 00:06:39,620
+And so what I did is
+I just went ahead and
+
+147
+00:06:39,620 --> 00:06:42,785
+tried to prove that or
+establish this property.
+
+148
+00:06:42,785 --> 00:06:45,110
+The reason why we writing
+
+149
+00:06:45,110 --> 00:06:47,300
+these properties
+down so precisely,
+
+150
+00:06:47,300 --> 00:06:49,880
+remember we know here
+what the derivative is,
+
+151
+00:06:49,880 --> 00:06:51,290
+we know what the
+languages we know
+
+152
+00:06:51,290 --> 00:06:53,495
+what this semantic derivative is.
+
+153
+00:06:53,495 --> 00:06:56,960
+This is usually when you are
+trying to prove something.
+
+154
+00:06:56,960 --> 00:06:58,715
+And very important step.
+
+155
+00:06:58,715 --> 00:07:01,385
+If you know what you're
+actually trying to prove,
+
+156
+00:07:01,385 --> 00:07:05,510
+then you have already
+done half of the proof.
+
+157
+00:07:05,510 --> 00:07:08,300
+Remember, we want to
+establish these properties
+
+158
+00:07:08,300 --> 00:07:10,460
+not just for some reg expression,
+
+159
+00:07:10,460 --> 00:07:13,940
+but for all wrecker expression
+and for all characters.
+
+160
+00:07:13,940 --> 00:07:17,225
+So proving is the only
+method to achieve that.
+
+161
+00:07:17,225 --> 00:07:18,590
+And I said at the beginning of
+
+162
+00:07:18,590 --> 00:07:21,170
+this video's proving is
+
+163
+00:07:21,170 --> 00:07:23,810
+not the most favorite
+subject of students.
+
+164
+00:07:23,810 --> 00:07:25,730
+But I can promise you,
+
+165
+00:07:25,730 --> 00:07:29,630
+if you just follow a very
+simple minded recipe,
+
+166
+00:07:29,630 --> 00:07:33,695
+that brings, you're already
+very far in these proofs.
+
+167
+00:07:33,695 --> 00:07:35,720
+Of course, there's
+always some creativity
+
+168
+00:07:35,720 --> 00:07:37,070
+involved in these proofs.
+
+169
+00:07:37,070 --> 00:07:39,380
+But a simple-minded
+recipe you can
+
+170
+00:07:39,380 --> 00:07:42,620
+follow brings you're already
+very close to the end.
+
+171
+00:07:42,620 --> 00:07:46,610
+So let's have a look how
+this recipe can be derived.
+
+172
+00:07:46,610 --> 00:07:48,740
+Establishing a property is for
+
+173
+00:07:48,740 --> 00:07:50,900
+all Rekha expression nearly
+
+174
+00:07:50,900 --> 00:07:55,805
+always requires an induction
+over wreck expressions.
+
+175
+00:07:55,805 --> 00:07:57,350
+And so we have to look at what's
+
+176
+00:07:57,350 --> 00:07:59,150
+the recipe for doing inductions
+
+177
+00:07:59,150 --> 00:08:02,390
+over wreck expressions
+and inductions.
+
+178
+00:08:02,390 --> 00:08:03,590
+You probably remember from
+
+179
+00:08:03,590 --> 00:08:07,100
+school time you did induction
+over natural numbers.
+
+180
+00:08:07,100 --> 00:08:11,120
+You had to establish that
+some property holds for 0.
+
+181
+00:08:11,120 --> 00:08:13,775
+And you had to establish it
+
+182
+00:08:13,775 --> 00:08:14,960
+that if you assume that
+
+183
+00:08:14,960 --> 00:08:16,970
+the property holds
+for the N case,
+
+184
+00:08:16,970 --> 00:08:21,365
+it has to also hold for
+the n plus one case.
+
+185
+00:08:21,365 --> 00:08:24,350
+And induction over wreck
+expressions is very,
+
+186
+00:08:24,350 --> 00:08:26,810
+very similar to this
+kind of induction.
+
+187
+00:08:26,810 --> 00:08:30,754
+Just that in case of
+break expressions,
+
+188
+00:08:30,754 --> 00:08:33,200
+we have essentially
+three base cases.
+
+189
+00:08:33,200 --> 00:08:35,360
+So in our grandma
+or we say it rec,
+
+190
+00:08:35,360 --> 00:08:38,705
+expressions are 01
+and characters.
+
+191
+00:08:38,705 --> 00:08:42,260
+And these will be the base
+cases in our induction.
+
+192
+00:08:42,260 --> 00:08:43,895
+So we have to establish
+
+193
+00:08:43,895 --> 00:08:46,475
+any property we want
+to prove by induction,
+
+194
+00:08:46,475 --> 00:08:49,055
+you have to show
+that it holds for 0,
+
+195
+00:08:49,055 --> 00:08:50,495
+it holds for one,
+
+196
+00:08:50,495 --> 00:08:52,505
+and it holds for the character.
+
+197
+00:08:52,505 --> 00:08:57,350
+So there's nothing else we
+can assume in these cases.
+
+198
+00:08:57,350 --> 00:08:59,300
+We just have to prove
+that this property
+
+199
+00:08:59,300 --> 00:09:01,835
+holds in these three base cases.
+
+200
+00:09:01,835 --> 00:09:04,490
+It is very much the
+same as if you proved
+
+201
+00:09:04,490 --> 00:09:06,050
+something over natural numbers
+
+202
+00:09:06,050 --> 00:09:08,270
+and you had to prove it for 0.
+
+203
+00:09:08,270 --> 00:09:10,880
+Now, the sequence case,
+
+204
+00:09:10,880 --> 00:09:12,755
+the alternative in a star.
+
+205
+00:09:12,755 --> 00:09:15,815
+These are the kind
+of inductive cases.
+
+206
+00:09:15,815 --> 00:09:17,750
+Remember in natural numbers,
+
+207
+00:09:17,750 --> 00:09:20,450
+you could assume that
+a property holds for
+
+208
+00:09:20,450 --> 00:09:24,770
+n and then you had to
+establish it for n plus one.
+
+209
+00:09:24,770 --> 00:09:26,945
+This is here very simmer.
+
+210
+00:09:26,945 --> 00:09:29,000
+You can assume that
+
+211
+00:09:29,000 --> 00:09:30,920
+these properties hold for
+
+212
+00:09:30,920 --> 00:09:33,140
+the components of
+these reg expression.
+
+213
+00:09:33,140 --> 00:09:35,840
+So you can assume that it
+holds for this on one,
+
+214
+00:09:35,840 --> 00:09:38,960
+and you can assume that
+it holds for this R2.
+
+215
+00:09:38,960 --> 00:09:41,090
+And your task is now to
+
+216
+00:09:41,090 --> 00:09:44,240
+establish this property
+also for sequences.
+
+217
+00:09:44,240 --> 00:09:46,255
+The same for alternative soup.
+
+218
+00:09:46,255 --> 00:09:48,710
+Assume that it holds
+already for R1.
+
+219
+00:09:48,710 --> 00:09:51,005
+You can assume that
+it holds for the R2.
+
+220
+00:09:51,005 --> 00:09:53,930
+You have to establish that
+this property holds also for
+
+221
+00:09:53,930 --> 00:09:58,460
+the entire alternative,
+the same four star.
+
+222
+00:09:58,460 --> 00:10:00,440
+So there you can assume it for r,
+
+223
+00:10:00,440 --> 00:10:02,330
+but you have to now
+establish it that
+
+224
+00:10:02,330 --> 00:10:04,760
+it also holds for r star.
+
+225
+00:10:04,760 --> 00:10:07,910
+Ok, that leads to this
+simple-minded recipe
+
+226
+00:10:07,910 --> 00:10:09,560
+which you can always follow.
+
+227
+00:10:09,560 --> 00:10:10,970
+So first you have to know
+
+228
+00:10:10,970 --> 00:10:13,085
+what's the property
+you want to show.
+
+229
+00:10:13,085 --> 00:10:16,790
+Luckily for you, that's
+almost always given by me.
+
+230
+00:10:16,790 --> 00:10:19,100
+So there's nothing
+to think about it.
+
+231
+00:10:19,100 --> 00:10:22,100
+But then what you have to
+do in an induction proof,
+
+232
+00:10:22,100 --> 00:10:23,855
+you have to show that
+this property holds
+
+233
+00:10:23,855 --> 00:10:26,345
+for 041 and characters.
+
+234
+00:10:26,345 --> 00:10:28,790
+These are the base
+cases in the induction.
+
+235
+00:10:28,790 --> 00:10:31,280
+And then in the induction cases,
+
+236
+00:10:31,280 --> 00:10:32,555
+you just have to say, well,
+
+237
+00:10:32,555 --> 00:10:34,535
+I know that this
+property holds for one,
+
+238
+00:10:34,535 --> 00:10:36,980
+I know that the
+property holds for R2.
+
+239
+00:10:36,980 --> 00:10:38,720
+What does it look like that
+
+240
+00:10:38,720 --> 00:10:40,895
+this property holds
+for the alternative,
+
+241
+00:10:40,895 --> 00:10:44,480
+for the sequence
+and for the r-star,
+
+242
+00:10:44,480 --> 00:10:46,955
+that will then requires
+some reasoning.
+
+243
+00:10:46,955 --> 00:10:48,230
+But if you know what you can
+
+244
+00:10:48,230 --> 00:10:50,359
+assume and what do you
+have to establish?
+
+245
+00:10:50,359 --> 00:10:52,940
+Often can already see bought
+
+246
+00:10:52,940 --> 00:10:55,490
+needs to be proved and
+what's the argument like?
+
+247
+00:10:55,490 --> 00:10:58,325
+So let's try that and
+then concrete example.
+
+248
+00:10:58,325 --> 00:11:01,265
+So how does approve
+work in real life?
+
+249
+00:11:01,265 --> 00:11:03,530
+And say that the beginning
+first you have to know
+
+250
+00:11:03,530 --> 00:11:05,630
+what's the property
+you want to prove.
+
+251
+00:11:05,630 --> 00:11:10,280
+So in our case, the P of R is if
+
+252
+00:11:10,280 --> 00:11:14,900
+nullable of R holds if
+
+253
+00:11:14,900 --> 00:11:19,760
+and only if the empty string
+is in the language of.
+
+254
+00:11:19,760 --> 00:11:23,690
+So we want to do an induction
+of rec expressions.
+
+255
+00:11:23,690 --> 00:11:28,400
+So we better have Fan property
+which depends on this
+
+256
+00:11:28,400 --> 00:11:31,130
+R. And so now in
+
+257
+00:11:31,130 --> 00:11:34,985
+each case I would start with
+a blank piece of paper.
+
+258
+00:11:34,985 --> 00:11:38,030
+And I would write on the top,
+
+259
+00:11:38,030 --> 00:11:40,190
+what are my assumptions?
+
+260
+00:11:40,190 --> 00:11:42,470
+And on the bottom, I would write
+
+261
+00:11:42,470 --> 00:11:45,485
+down what I need to prove.
+
+262
+00:11:45,485 --> 00:11:49,280
+And then I would try
+to somehow navigate
+
+263
+00:11:49,280 --> 00:11:53,540
+my way from these assumptions
+to what I want to prove,
+
+264
+00:11:53,540 --> 00:11:57,740
+to somehow infer
+small steps so that I
+
+265
+00:11:57,740 --> 00:11:59,960
+can assure that what I'm trying
+
+266
+00:11:59,960 --> 00:12:02,780
+to prove holds for
+my assumptions.
+
+267
+00:12:02,780 --> 00:12:06,905
+Obviously. That depends
+on how hard that case is.
+
+268
+00:12:06,905 --> 00:12:08,930
+And if you look at
+our first case,
+
+269
+00:12:08,930 --> 00:12:10,835
+what we have to show is if to
+
+270
+00:12:10,835 --> 00:12:13,655
+prove that this P of 0 holds.
+
+271
+00:12:13,655 --> 00:12:15,410
+And actually no assumption in
+
+272
+00:12:15,410 --> 00:12:17,450
+this case because it's
+one of the base cases.
+
+273
+00:12:17,450 --> 00:12:23,015
+So what I have to show is I
+have to show that nullable of
+
+274
+00:12:23,015 --> 00:12:29,570
+0 if and only if the empty
+string is in L of 0.
+
+275
+00:12:29,570 --> 00:12:31,340
+Okay? And I have to
+
+276
+00:12:31,340 --> 00:12:34,190
+somehow show that this
+implementation of nullable in
+
+277
+00:12:34,190 --> 00:12:35,600
+this case produces
+
+278
+00:12:35,600 --> 00:12:39,185
+exactly the same answer
+as that specification.
+
+279
+00:12:39,185 --> 00:12:41,660
+Okay? So what is nullable of 0 is
+
+280
+00:12:41,660 --> 00:12:45,605
+defined where debt is
+just false by definition.
+
+281
+00:12:45,605 --> 00:12:50,225
+And what is this
+language of LCA defined?
+
+282
+00:12:50,225 --> 00:12:52,535
+Well, that is just
+the empty string.
+
+283
+00:12:52,535 --> 00:12:55,010
+In the empty language.
+
+284
+00:12:55,010 --> 00:12:57,110
+Is the empty string and
+the empty language,
+
+285
+00:12:57,110 --> 00:12:59,240
+no, that is false too.
+
+286
+00:12:59,240 --> 00:13:01,610
+So I have shown that
+
+287
+00:13:01,610 --> 00:13:03,890
+both of them produce
+the same result.
+
+288
+00:13:03,890 --> 00:13:08,765
+So I would already be done
+in the first base case.
+
+289
+00:13:08,765 --> 00:13:12,125
+The second base case is not
+so much more difficult.
+
+290
+00:13:12,125 --> 00:13:15,455
+We want to prove
+that P of one holds.
+
+291
+00:13:15,455 --> 00:13:21,680
+And that would say
+if nullable of one,
+
+292
+00:13:21,680 --> 00:13:23,825
+if and only if
+
+293
+00:13:23,825 --> 00:13:27,665
+the empty string is in
+the language of one.
+
+294
+00:13:27,665 --> 00:13:31,235
+Okay? I can again look
+at what's that defined,
+
+295
+00:13:31,235 --> 00:13:33,275
+that is defined as true.
+
+296
+00:13:33,275 --> 00:13:35,450
+And what is this language to
+
+297
+00:13:35,450 --> 00:13:37,640
+find where that is defined as
+
+298
+00:13:37,640 --> 00:13:39,320
+the empty string in
+
+299
+00:13:39,320 --> 00:13:42,230
+the language containing
+the empty string.
+
+300
+00:13:42,230 --> 00:13:44,150
+And is the empty string in
+
+301
+00:13:44,150 --> 00:13:46,055
+the language containing
+the empty string?
+
+302
+00:13:46,055 --> 00:13:47,975
+Yes, that's true as well.
+
+303
+00:13:47,975 --> 00:13:50,735
+So again, I've shown that
+the actual implementation
+
+304
+00:13:50,735 --> 00:13:55,250
+produces the same result
+as the specification.
+
+305
+00:13:55,250 --> 00:13:57,635
+I leave the character base case
+
+306
+00:13:57,635 --> 00:14:00,690
+for you to fill in the details.
+
+307
+00:14:01,150 --> 00:14:03,320
+More interesting are these
+
+308
+00:14:03,320 --> 00:14:05,300
+inductive cases as
+you can imagine.
+
+309
+00:14:05,300 --> 00:14:08,360
+So let's try to prove
+it for the alternative.
+
+310
+00:14:08,360 --> 00:14:10,639
+So I want to show a property
+
+311
+00:14:10,639 --> 00:14:13,790
+for this record expression way.
+
+312
+00:14:13,790 --> 00:14:15,410
+What does our property save
+
+313
+00:14:15,410 --> 00:14:16,970
+for that reg expression
+where we have to
+
+314
+00:14:16,970 --> 00:14:20,915
+say that nullable of
+
+315
+00:14:20,915 --> 00:14:25,490
+R1 plus R2 if and only if
+
+316
+00:14:25,490 --> 00:14:31,519
+the empty string is in the
+language if R1 plus R2.
+
+317
+00:14:31,519 --> 00:14:35,090
+Ok? So I would now go ahead with
+
+318
+00:14:35,090 --> 00:14:39,695
+my plank piece of paper and
+I would write down on top.
+
+319
+00:14:39,695 --> 00:14:41,475
+Now what can I assume?
+
+320
+00:14:41,475 --> 00:14:45,170
+Well, I can assume
+that P of R1 holds and
+
+321
+00:14:45,170 --> 00:14:49,175
+I can assume what
+p of r two holds.
+
+322
+00:14:49,175 --> 00:14:50,750
+And let's just write that down.
+
+323
+00:14:50,750 --> 00:14:56,210
+We can say that nullable of R1 if
+
+324
+00:14:56,210 --> 00:14:58,655
+and only if the empty string
+
+325
+00:14:58,655 --> 00:15:02,000
+is in the language of
+R1, I can assume that.
+
+326
+00:15:02,000 --> 00:15:03,500
+And I can assume
+
+327
+00:15:03,500 --> 00:15:10,355
+that none above our two if
+and only if the empty string.
+
+328
+00:15:10,355 --> 00:15:14,675
+Or to Hawaii. What's now,
+
+329
+00:15:14,675 --> 00:15:15,830
+what do I want to prove?
+
+330
+00:15:15,830 --> 00:15:18,455
+Where I want to
+prove this property?
+
+331
+00:15:18,455 --> 00:15:21,935
+So let me write this here
+again on the bottom.
+
+332
+00:15:21,935 --> 00:15:30,935
+So p of R1 plus R2
+is narrow form of R1
+
+333
+00:15:30,935 --> 00:15:34,595
+plus R2 if and only if
+
+334
+00:15:34,595 --> 00:15:40,610
+the empty string is in the
+language of R1 plus R2.
+
+335
+00:15:40,610 --> 00:15:43,280
+Well, the first thing I can do is
+
+336
+00:15:43,280 --> 00:15:46,760
+I can infer what's the
+definition of that?
+
+337
+00:15:46,760 --> 00:15:51,920
+Where that would
+be nullable of R1
+
+338
+00:15:51,920 --> 00:15:57,530
+or nullable of our two.
+
+339
+00:15:57,530 --> 00:15:59,960
+That's how it was defined, okay?
+
+340
+00:15:59,960 --> 00:16:01,730
+And what can I do with this?
+
+341
+00:16:01,730 --> 00:16:04,700
+I can also look at how is
+
+342
+00:16:04,700 --> 00:16:08,510
+this language to find
+where that is defined as
+
+343
+00:16:08,510 --> 00:16:16,730
+the empty string in the
+language of R1 union R2.
+
+344
+00:16:16,730 --> 00:16:19,910
+Okay? So what does that mean?
+
+345
+00:16:19,910 --> 00:16:23,585
+Well, Venice, the empty
+string in a union.
+
+346
+00:16:23,585 --> 00:16:27,095
+Well, it has to be
+the case that either
+
+347
+00:16:27,095 --> 00:16:31,190
+the empty string
+is in L of R one,
+
+348
+00:16:31,190 --> 00:16:37,670
+all the empty string
+is o. L of R two.
+
+349
+00:16:37,670 --> 00:16:43,715
+Ok, so that's, I just
+applied all the definitions.
+
+350
+00:16:43,715 --> 00:16:47,930
+And now I have to look better
+my induction hypothesis.
+
+351
+00:16:47,930 --> 00:16:50,930
+These somehow help.
+
+352
+00:16:50,930 --> 00:16:55,820
+Well, I know nullable
+of R1 holds if
+
+353
+00:16:55,820 --> 00:17:00,815
+and only if the empty
+string is in L of one.
+
+354
+00:17:00,815 --> 00:17:05,330
+So that is by the induction
+hypothesis, a hypothesis.
+
+355
+00:17:05,330 --> 00:17:07,670
+And I also know by this,
+
+356
+00:17:07,670 --> 00:17:10,040
+that this holds only if
+
+357
+00:17:10,040 --> 00:17:17,465
+the empty string is in R2
+and there's an in-between.
+
+358
+00:17:17,465 --> 00:17:19,580
+Now you can see again,
+
+359
+00:17:19,580 --> 00:17:23,690
+this is what my specification
+say It must hold.
+
+360
+00:17:23,690 --> 00:17:25,640
+And this is essentially
+
+361
+00:17:25,640 --> 00:17:28,685
+what I could infer from
+my implementation.
+
+362
+00:17:28,685 --> 00:17:30,140
+And they are the same.
+
+363
+00:17:30,140 --> 00:17:32,270
+So also this case,
+
+364
+00:17:32,270 --> 00:17:35,520
+I've shown my property holds.
+
+365
+00:17:35,740 --> 00:17:39,275
+Just for fun. Let's do this
+also for the sequence.
+
+366
+00:17:39,275 --> 00:17:43,835
+So we want to show
+P of 014 by two.
+
+367
+00:17:43,835 --> 00:17:51,050
+And that means I have to
+prove that nullable of R1 by
+
+368
+00:17:51,050 --> 00:17:54,095
+R2 if and only if
+
+369
+00:17:54,095 --> 00:18:00,440
+the empty string is
+one followed by two.
+
+370
+00:18:00,440 --> 00:18:04,880
+Ok? And in this case
+I'm just writing
+
+371
+00:18:04,880 --> 00:18:09,695
+my induction
+hypothesis on the top.
+
+372
+00:18:09,695 --> 00:18:11,060
+And that would be,
+
+373
+00:18:11,060 --> 00:18:13,460
+I know that p of r one holds.
+
+374
+00:18:13,460 --> 00:18:18,290
+So another ball of R one
+
+375
+00:18:18,290 --> 00:18:24,860
+if and only if the empty
+string is in L one.
+
+376
+00:18:24,860 --> 00:18:29,525
+And the P of R to say
+is that I know already
+
+377
+00:18:29,525 --> 00:18:36,815
+nullable of R2 if and only
+if the empty string then R2.
+
+378
+00:18:36,815 --> 00:18:39,230
+Okay? And I know some hard to
+
+379
+00:18:39,230 --> 00:18:42,285
+infer that under
+these assumptions,
+
+380
+00:18:42,285 --> 00:18:45,970
+this property, like
+in the earlier case,
+
+381
+00:18:45,970 --> 00:18:48,070
+I just have a look how
+
+382
+00:18:48,070 --> 00:18:51,145
+everything is defined and
+see if I can go from there.
+
+383
+00:18:51,145 --> 00:18:56,485
+So nullable of the
+sequence would be nullable
+
+384
+00:18:56,485 --> 00:19:04,555
+of R1 and nullable of R2.
+
+385
+00:19:04,555 --> 00:19:08,559
+Okay? By the
+induction hypothesis,
+
+386
+00:19:08,559 --> 00:19:16,255
+I can now infer that
+the empty string is
+
+387
+00:19:16,255 --> 00:19:24,750
+in L one and the empty
+string is in L two.
+
+388
+00:19:24,750 --> 00:19:28,175
+Ok? Not much as I
+can do on this side.
+
+389
+00:19:28,175 --> 00:19:31,520
+But I can now try
+to infer something
+
+390
+00:19:31,520 --> 00:19:33,110
+here because I know how
+
+391
+00:19:33,110 --> 00:19:35,735
+this L of this
+sequence is defined.
+
+392
+00:19:35,735 --> 00:19:39,665
+That says the empty
+string has to be n l,
+
+393
+00:19:39,665 --> 00:19:45,110
+one append of L of R, two.
+
+394
+00:19:45,110 --> 00:19:48,365
+Ok? So they are not yet equal.
+
+395
+00:19:48,365 --> 00:19:50,150
+But now I have to
+essentially scratch
+
+396
+00:19:50,150 --> 00:19:52,430
+my head. What does that mean?
+
+397
+00:19:52,430 --> 00:19:54,965
+So this is the language
+concatenation,
+
+398
+00:19:54,965 --> 00:19:57,035
+Venice, the empty string.
+
+399
+00:19:57,035 --> 00:20:01,265
+In this concatenation of
+these two language as well.
+
+400
+00:20:01,265 --> 00:20:05,990
+It will be there if
+L of our wormholes
+
+401
+00:20:05,990 --> 00:20:12,560
+and The empty string
+of L R2 holds.
+
+402
+00:20:12,560 --> 00:20:14,600
+And so again, I have shown that
+
+403
+00:20:14,600 --> 00:20:17,780
+this is equivalent to that.
+
+404
+00:20:17,780 --> 00:20:20,480
+Which means that this must be
+
+405
+00:20:20,480 --> 00:20:24,540
+equal to that o holds
+if and only if.
+
+406
+00:20:25,600 --> 00:20:28,940
+Finally, the star case,
+
+407
+00:20:28,940 --> 00:20:31,235
+p of r star.
+
+408
+00:20:31,235 --> 00:20:35,450
+Ok? So I know my
+induction hypothesis,
+
+409
+00:20:35,450 --> 00:20:41,075
+let me write it again on
+the top that P of R holds,
+
+410
+00:20:41,075 --> 00:20:46,835
+which means nullable of R if and
+
+411
+00:20:46,835 --> 00:20:53,060
+only if the empty
+string is in L. Okay?
+
+412
+00:20:53,060 --> 00:20:58,580
+And I have to prove
+that nullable of
+
+413
+00:20:58,580 --> 00:21:08,930
+r star if and only if the
+empty string is n l of r star.
+
+414
+00:21:08,930 --> 00:21:15,020
+Ok? So again, I just follow
+how are things define?
+
+415
+00:21:15,020 --> 00:21:18,125
+So this is defined
+actually as true.
+
+416
+00:21:18,125 --> 00:21:21,365
+So a star's always nullable.
+
+417
+00:21:21,365 --> 00:21:23,090
+What is this defined where,
+
+418
+00:21:23,090 --> 00:21:26,225
+remember that wants to find
+using the Kleene star.
+
+419
+00:21:26,225 --> 00:21:28,685
+So that is essentially L of
+
+420
+00:21:28,685 --> 00:21:33,680
+R. And then we had this
+star on languages.
+
+421
+00:21:33,680 --> 00:21:36,260
+And does the star on
+
+422
+00:21:36,260 --> 00:21:39,380
+languages always contain
+the empty string?
+
+423
+00:21:39,380 --> 00:21:42,980
+Yes, that is true as well.
+
+424
+00:21:42,980 --> 00:21:45,320
+And so again, I've
+shown their clone.
+
+425
+00:21:45,320 --> 00:21:47,420
+And here you can see, even if I
+
+426
+00:21:47,420 --> 00:21:49,730
+have here in this case
+an induction hypothesis,
+
+427
+00:21:49,730 --> 00:21:51,860
+I actually didn't need
+that in my reasoning.
+
+428
+00:21:51,860 --> 00:21:53,960
+Well, that might happen.
+
+429
+00:21:53,960 --> 00:21:56,840
+This might have been the
+first induction proof
+
+430
+00:21:56,840 --> 00:21:58,445
+for regular expressions for you.
+
+431
+00:21:58,445 --> 00:22:00,680
+But you've seen at
+Watson so difficult,
+
+432
+00:22:00,680 --> 00:22:02,480
+you just follow this recipe.
+
+433
+00:22:02,480 --> 00:22:04,280
+It told us we can
+
+434
+00:22:04,280 --> 00:22:06,725
+assume in each case
+what we have to prove.
+
+435
+00:22:06,725 --> 00:22:09,860
+And then we essentially just
+followed what are things to
+
+436
+00:22:09,860 --> 00:22:13,595
+find and try to infer
+whether in each case,
+
+437
+00:22:13,595 --> 00:22:15,560
+both of them produce
+the same result,
+
+438
+00:22:15,560 --> 00:22:18,245
+the specification and
+the implementation.
+
+439
+00:22:18,245 --> 00:22:20,300
+And the great thing about this,
+
+440
+00:22:20,300 --> 00:22:22,250
+if you've done all these cases,
+
+441
+00:22:22,250 --> 00:22:25,490
+we not just know that it holds
+for some reg expression,
+
+442
+00:22:25,490 --> 00:22:28,280
+but actually it holds for
+all wreck expressions be
+
+443
+00:22:28,280 --> 00:22:30,890
+couldn't achieve the same
+result by looking just
+
+444
+00:22:30,890 --> 00:22:33,710
+at testing because
+we would not be
+
+445
+00:22:33,710 --> 00:22:37,280
+able to test for all
+record expressions.
+
+446
+00:22:37,280 --> 00:22:39,560
+Ok, of course, there will be
+
+447
+00:22:39,560 --> 00:22:43,010
+also induction proofs which
+are more complicated.
+
+448
+00:22:43,010 --> 00:22:46,170
+But let's leave that
+for another time.
+
+449
+00:22:49,210 --> 00:22:53,405
+You know, I always like to
+talk about the subject.
+
+450
+00:22:53,405 --> 00:22:55,700
+So here's one last thought.
+
+451
+00:22:55,700 --> 00:22:58,040
+You might think these proofs for
+
+452
+00:22:58,040 --> 00:22:59,930
+this little property
+about nullable are
+
+453
+00:22:59,930 --> 00:23:04,295
+totally overkill and
+total waste of your time.
+
+454
+00:23:04,295 --> 00:23:07,310
+Please rest assured disproves
+
+455
+00:23:07,310 --> 00:23:10,160
+have saved me from so
+much embarrassment.
+
+456
+00:23:10,160 --> 00:23:13,745
+You cannot imagine,
+once I have a proof,
+
+457
+00:23:13,745 --> 00:23:15,425
+I can sleep much better.
+
+458
+00:23:15,425 --> 00:23:16,820
+If I don't have a proof,
+
+459
+00:23:16,820 --> 00:23:19,560
+I don't believe what I'm doing.