regexp: comparison prio/Paper/Paper.thy

equal deleted inserted replaced

-:c5442db6a5cb
+:5faa1b59e870
 In our opinion, there is clearly a need for investigating correct
 algorithms for PIP. A few specifications for PIP exist (in English)
 and also a few high-level descriptions of implementations (e.g.~in
 the textbook \cite[Section 5.6.5]{Vahalia96}), but they help little
-with actual implementations. That this is a problem in practise is
+with actual implementations. That this is a problem in practice is
 proved by an email from Baker, who wrote on 13 July 2009 on the Linux
 Kernel mailing list:
 \begin{quote}
 \it{}``I observed in the kernel code (to my disgust), the Linux PIP
 release a resource. This means we need to include threads that transitively
 wait for a resource being released (in the picture above this means the dependants
 of @{text "th\<^isub>0"} are @{text "th\<^isub>1"} and @{text "th\<^isub>2"}, which wait for resource @{text "cs\<^isub>1"},
 but also @{text "th\<^isub>3"},
 which cannot make any progress unless @{text "th\<^isub>2"} makes progress, which
-in turn needs to wait for @{text "th\<^isub>0"} to finish). If there is a circle in a RAG, then clearly
+in turn needs to wait for @{text "th\<^isub>0"} to finish). If there is a circle of dependencies
+in a RAG, then clearly
 we have a deadlock. Therefore when a thread requests a resource,
 we must ensure that the resulting RAG is not circular.
 Next we introduce the notion of the \emph{current precedence} of a thread @{text th} in a
 state @{text s}. It is defined as
 \noindent
 The first function is a waiting queue function (that is, it takes a
 resource @{text "cs"} and returns the corresponding list of threads
 that lock, respectively wait for, it); the second is a function that
 takes a thread and returns its current precedence (see
-\eqref{cpreced}). We assume the usual getter and setter methods for
+the definition in \eqref{cpreced}). We assume the usual getter and setter methods for
 such records.
 In the initial state, the scheduler starts with all resources unlocked (the corresponding
 function is defined in \eqref{allunlocked}) and the
 current precedence of every thread is initialised with @{term "Prc 0 0"}; that means
 \mbox{@{abbrev initial_cprec}}. Therefore
-we have for the initial state
+we have for the initial shedule state
 \begin{isabelle}\ \ \ \ \ %%%
 \begin{tabular}{@ {}l}
 @{thm (lhs) schs.simps(1)} @{text "\<equiv>"}\\
 \hspace{5mm}@{term "(|wq_fun = all_unlocked, cprec_fun = (\<lambda>_::thread. Prc 0 0)|)"}
 @{thm runing_def}\\
 \end{tabular}
 \end{isabelle}
 \noindent
-In this definition @{term "DUMMY ` DUMMY"} stands for the image of a set under a function.
+In the second definition @{term "DUMMY ` DUMMY"} stands for the image of a set under a function.
 Note that in the initial state, that is where the list of events is empty, the set
 @{term threads} is empty and therefore there is neither a thread ready nor running.
 If there is one or more threads ready, then there can only be \emph{one} thread
 running, namely the one whose current precedence is equal to the maximum of all ready
 threads. We use sets to capture both possibilities.
 @{thm holdents_def}
 \end{isabelle}
 Finally we can define what a \emph{valid state} is in our model of PIP. For
 example we cannot expect to be able to exit a thread, if it was not
-created yet. These validity constraints on states are characterised by the
+created yet. This would cause havoc  in any scheduler.
+These validity constraints on states are characterised by the
 inductive predicate @{term "step"} and @{term vt}. We first give five inference rules
 for @{term step} relating a state and an event that can happen next.
 \begin{center}
 \begin{tabular}{c}
 text {*
 Sha et al.~state their first correctness criterion for PIP in terms
 of the number of low-priority threads \cite[Theorem 3]{Sha90}: if
 there are @{text n} low-priority threads, then a blocked job with
 high priority can only be blocked @{text n} times.
-Their second correctness criterion is stated
+Their second correctness criterion is given
 in terms of the number of critical resources \cite[Theorem 6]{Sha90}: if there are
 @{text m} critical resources, then a blocked job with high priority
 can only be blocked @{text m} times. Both results on their own, strictly speaking, do
 \emph{not} prevent indefinite, or unbounded, Priority Inversion,
 because if a low-priority thread does not give up its critical
 resource (the one the high-priority thread is waiting for), then the
 high-priority thread can never run.  The argument of Sha et al.~is
 that \emph{if} threads release locked resources in a finite amount
 of time, then indefinite Priority Inversion cannot occur---the high-priority
 thread is guaranteed to run eventually. The assumption is that
-programmers always ensure that this is the case.  However, even
+programmers must ensure that threads are programmed in this way.  However, even
-taking this assumption into account, their correctness properties are
+taking this assumption into account, the correctness properties of
-\emph{not} true for their version of PIP. As Yodaiken
+Sha et al.~are
+\emph{not} true for their version of PIP---despide being ``proved''. As Yodaiken
 \cite{Yodaiken02} pointed out: If a low-priority thread possesses
 locks to two resources for which two high-priority threads are
 waiting for, then lowering the priority prematurely after giving up
 only one lock, can cause indefinite Priority Inversion for one of the
 high-priority threads, invalidating their two bounds.
 Even when fixed, their proof idea does not seem to go through for
-us, because of the way we have set up our formal model of PIP.  The
+us, because of the way we have set up our formal model of PIP.  One
 reason is that we allow that critical sections can intersect
-(something Sha et al.~explicitly exclude).  Therefore we have a
+(something Sha et al.~explicitly exclude).  Therefore we have designed a
 different correctness criterion for PIP. The idea behind our
 criterion is as follows: for all states @{text
 s}, we know the corresponding thread @{text th} with the highest
 precedence; we show that in every future state (denoted by @{text
 "s' @ s"}) in which @{text th} is still alive, either @{text th} is
 \end{tabular}
 \end{isabelle}
 \end{quote}
 \noindent
+The locale mechanism of Isabelle helps us to manage conveniently such assumptions~\cite{Haftmann08}.
 Under these assumptions we will prove the following correctness property:
 \begin{theorem}\label{mainthm}
 Given the assumptions about states @{text "s"} and @{text "s' @ s"},
 the thread @{text th} and the events in @{text "s'"},
 highest precedence in the state @{text s}, can only be blocked in
 the state @{text "s' @ s"} by a thread @{text th'} that already
 existed in @{text s} and requested or had a lock on at least
 one resource---that means the thread was not \emph{detached} in @{text s}.
 As we shall see shortly, that means there are only finitely
-many threads that can block @{text th} and then need to run
+many threads that can block @{text th} in this way and then they
-with the same current precedence as @{text th}.
+need to run with the same current precedence as @{text th}.
 Like in the argument by Sha et al.~our
 finite bound does not guarantee absence of indefinite Priority
 Inversion. For this we further have to assume that every thread
-gives up its resources (that means get detached) after a finite amount of time. We found that
+gives up its resources after a finite amount of time. We found that
 this assumption is awkward to formalise in our model. Therefore we
 leave it out and let the programmer assume the responsibility to
 program threads in such a benign manner (in addition to causing no
 circularity in the @{text RAG}). In this detail, we do not
 make any progress in comparison with the work by Sha et al.
 However, we are able to combine their two separate bounds into a
-single theorem.
+single theorem improving their bound.
 In what follows we will describe properties of PIP that allow us to prove
 Theorem~\ref{mainthm} and, when instructive, briefly describe our argument.
 It is relatively easily to see that
 @{thm[mode=IfThen]  finite_threads}
 \end{tabular}
 \end{isabelle}
 \noindent
-whereby the second property is by induction of @{term vt}. The next three
+The second property is by induction of @{term vt}. The next three
 properties are
 \begin{isabelle}\ \ \ \ \ %%%
 \begin{tabular}{@ {}l}
 @{thm[mode=IfThen] waiting_unique[of _ _ "cs\<^isub>1" "cs\<^isub>2"]}\\
 then @{text "th' \<notin> running (s' @ s)"}.
 \end{lemma}
 \noindent
 The point of this lemma is that a thread different from @{text th} (which has the highest
-precedence in @{text s}) not holding any resource cannot be running
+precedence in @{text s}) and not holding any resource, cannot be running
 in the state @{text "s' @ s"}.
 \begin{proof}
 Since thread @{text "th'"} does not hold any resource, no thread can depend on it.
 Therefore its current precedence @{term "cp (s' @ s) th'"} equals its own precedence
 By defintion of @{text "running"}, @{text "th'"} can not be running in state
 @{text "s' @ s"}, as we had to show.\qed
 \end{proof}
 \noindent
-Since @{text "th'"} is not able to run at state @{text "s' @ s"}, it is not able to
+Since @{text "th'"} is not able to run in state @{text "s' @ s"}, it is not able to
 issue a @{text "P"} or @{text "V"} event. Therefore if @{text "s' @ s"} is extended
 one step further, @{text "th'"} still cannot hold any resource. The situation will
 not change in further extensions as long as @{text "th"} holds the highest precedence.
 From this lemma we can deduce Theorem~\ref{mainthm}: that @{text th} can only be
 blocked by a thread @{text th'} that
 held some resource in state @{text s} (that is not @{text "detached"}). And furthermore
 that the current precedence of @{text th'} in state @{text "(s' @ s)"} must be equal to the
 precedence of @{text th} in @{text "s"}.
 We show this theorem by induction on @{text "s'"} using Lemma~\ref{mainlem}.
-This theorem gives a stricter bound on the processes that can block @{text th}:
+This theorem gives a stricter bound on the processes that can block @{text th} than the
+one obtained by Sha et al.~\cite{Sha90}:
 only threads that were alive in state @{text s} and moreover held a resource.
 This means our bound is in terms of both---alive threads in state @{text s}
 and number of critical resources. Finally, the theorem establishes that the blocking threads have the
 current precedence raised to the precedence of @{text th}.
 al.~\cite{Sha90}), we found that the formalisation can even help us
 with efficiently implementing PIP.
 For example Baker complained that calculating the current precedence
 in PIP is quite ``heavy weight'' in Linux (see the Introduction).
-In our model of PIP the current precedence of a thread in a state s
+In our model of PIP the current precedence of a thread in a state @{text s}
 depends on all its dependants---a ``global'' transitive notion,
 which is indeed heavy weight (see Def.~shown in \eqref{cpreced}).
 We can however improve upon this. For this let us define the notion
 of @{term children} of a thread @{text th} in a state @{text s} as
 @{thm children_def2}
 \end{tabular}
 \end{isabelle}
 \noindent
-where a child is a thread that is one ``hop'' away from the tread
+where a child is a thread that is only one ``hop'' away from the tread
 @{text th} in the @{term RAG} (and waiting for @{text th} to release
-a resource). We can prove that
+a resource). We can prove the following lemma.
 \begin{lemma}\label{childrenlem}
 @{text "If"} @{thm (prem 1) cp_rec} @{text "then"}
 \begin{center}
 @{thm (concl) cp_rec}.
 precedence is computed in this more efficient manner, the selection
 of the thread with highest precedence from a set of ready threads is
 a standard scheduling operation implemented in most operating
 systems.
-Of course the main implementation work for PIP involves the
+Of course the main work for implementing PIP involves the
 scheduler and coding how it should react to events.  Below we
 outline how our formalisation guides this implementation for each
 kind of event.\smallskip
 *}
 @{thm[mode=IfThen] eq_cp}
 \end{tabular}
 \end{isabelle}
 \noindent
-This means we do not have recalculate the @{text RAG} and also none of the
+This means in an implementation we do not have recalculate the @{text RAG} and also none of the
 current precedences of the other threads. The current precedence of the created
 thread @{text th} is just its precedence, namely the pair @{term "(prio, length (s::event list))"}.
 \smallskip
 *}
 (*<*)
 \noindent
 The first property is again telling us we do not need to change the @{text RAG}. The second
 however states that only threads that are \emph{not} dependants of @{text th} have their
 current precedence unchanged. For the others we have to recalculate the current
 precedence. To do this we can start from @{term "th"}
-and follow the @{term "depend"}-chains to recompute the @{term "cp"} of every
+and follow the @{term "depend"}-edges to recompute  using Lemma~\ref{childrenlem}
-thread encountered on the way using Lemma~\ref{childrenlem}. Since the @{term "depend"}
+the @{term "cp"} of every
+thread encountered on the way. Since the @{term "depend"}
 is loop free, this procedure will always stop. The following two lemmas show, however,
 that this procedure can actually stop often earlier without having to consider all
 dependants.
 \begin{isabelle}\ \ \ \ \ %%%
 @{text "then"} @{thm (concl) eq_up}.
 \end{tabular}
 \end{isabelle}
 \noindent
-The first states that if the current precedence of @{text th} is unchanged,
+The first lemma states that if the current precedence of @{text th} is unchanged,
 then the procedure can stop immediately (all dependent threads have their @{term cp}-value unchanged).
 The second states that if an intermediate @{term cp}-value does not change, then
 the procedure can also stop, because none of its dependent threads will
 have their current precedence changed.
 \smallskip
 \colorbox{mygrey}{@{term "V th cs"}:} We assume that @{text s'} and
 @{term "s \<equiv> V th cs#s'"} are both valid. We have to consider two
 subcases: one where there is a thread to ``take over'' the released
 resource @{text cs}, and one where there is not. Let us consider them
 in turn. Suppose in state @{text s}, the thread @{text th'} takes over
-resource @{text cs} from thread @{text th}. We can show
+resource @{text cs} from thread @{text th}. We can prove
 \begin{isabelle}\ \ \ \ \ %%%
 @{thm depend_s}
 \end{isabelle}
 \noindent
-which shows how the @{text RAG} needs to be changed. This also suggests
+which shows how the @{text RAG} needs to be changed. The next lemma suggests
 how the current precedences need to be recalculated. For threads that are
 not @{text "th"} and @{text "th'"} nothing needs to be changed, since we
 can show
 \begin{isabelle}\ \ \ \ \ %%%
 (*<*)
 end
 context step_P_cps_e
 begin
 (*>*)
 text {*
 \noindent
 \colorbox{mygrey}{@{term "P th cs"}:} We assume that @{text s'} and
 @{term "s \<equiv> P th cs#s'"} are both valid. We again have to analyse two subcases, namely
 the one where @{text cs} is locked, and where it is not. We treat the second case
 \noindent
 That means we have to add a waiting edge to the @{text RAG}. Furthermore
 the current precedence for all threads that are not dependants of @{text th}
 are unchanged. For the others we need to follow the edges
 in the @{text RAG} and recompute the @{term "cp"}. However, like in the
-@case of {text Set}, this operation can stop often earlier, namely when intermediate
+case of @{text Set}, this operation can stop often earlier, namely when intermediate
 values do not change.
 *}
 (*<*)
 end
 (*>*)
 text {*
 \noindent
-A pleasing result of our formalisation is that the properties in
+As can be seen, a pleasing byproduct of our formalisation is that the properties in
-this section closely inform an implementation of PIP:  Whether the
+this section closely inform an implementation of PIP, namely whether the
 @{text RAG} needs to be reconfigured or current precedences need to
-recalculated for an event is given by a lemma we proved.
+by recalculated for an event. This information is provided by the lemmas we proved.
 *}
 section {* Conclusion *}
 text {*
 The Priority Inheritance Protocol (PIP) is a classic textbook
-algorithm used in real-time operating systems in order to avoid the problem of
+algorithm used in many real-time operating systems in order to avoid the problem of
 Priority Inversion.  Although classic and widely used, PIP does have
 its faults: for example it does not prevent deadlocks in cases where threads
 have circular lock dependencies.
 We had two goals in mind with our formalisation of PIP: One is to
 \cite{Wang09}.
 The second goal of our formalisation is to provide a specification for actually
 implementing PIP. Textbooks, for example \cite[Section 5.6.5]{Vahalia96},
 explain how to use various implementations of PIP and abstractly
-discuss their properties, but surprisingly lack most details for a
+discuss their properties, but surprisingly lack most details important for a
-programmer who wants to implement PIP.  That this is an issue in practice is illustrated by the
+programmer who wants to implement PIP (similarly Sha et al.~\cite{Sha90}).
+That this is an issue in practice is illustrated by the
 email from Baker we cited in the Introduction. We achieved also this
 goal: The formalisation gives the first author enough data to enable
 his undergraduate students to implement PIP (as part of their OS course)
 on top of PINTOS, a small operating system for teaching
 purposes. A byproduct of our formalisation effort is that nearly all
 design choices for the PIP scheduler are backed up with a proved
 lemma. We were also able to establish the property that the choice of
 the next thread which takes over a lock is irrelevant for the correctness
-of PIP. Earlier model checking approaches which verified implementations
+of PIP. Earlier model checking approaches which verified particular implementations
 of PIP \cite{Faria08,Jahier09,Wellings07} cannot
 provide this kind of ``deep understanding'' about the principles behind
 PIP and its correctness.
 PIP is a scheduling algorithm for single-processor systems. We are
 teaching. Priority Inversion certainly occurs also in
 multi-processor systems.  However, the surprising answer, according
 to \cite{Steinberg10}, is that except for one unsatisfactory
 proposal nobody has a good idea for how PIP should be modified to
 work correctly on multi-processor systems. The difficulties become
-clear when considering that locking and releasing a resource always
+clear when considering the fact that locking and releasing a resource always
 requires a small amount of time. If processes work independently,
 then a low priority process can ``steal'' in such an unguarded
-moment a lock for a resource that was supposed allow a high-priority
+moment a lock for a resource that was supposed to allow a high-priority
 process to run next. Thus the problem of Priority Inversion is not
-really prevented. It seems difficult to design a PIP-algorithm with
+really prevented by the classic PIP. It seems difficult to design a PIP-algorithm with
 a meaningful correctness property on a multi-processor systems where
 processes work independently.  We can imagine PIP to be of use in
 situations where processes are \emph{not} independent, but
 coordinated via a master process that distributes work over some
-slave processes. However, a formal investigation of this is beyond
+slave processes. However, a formal investigation of this idea is beyond
 the scope of this paper.  We are not aware of any proofs in this
-area, not even informal ones.
+area, not even informal or flawed ones.
 The most closely related work to ours is the formal verification in
-PVS for Priority Ceiling done by Dutertre \cite{dutertre99b}. His formalisation
+PVS of the Priority Ceiling Protocol done by Dutertre
-consists of 407 lemmas and 2500 lines of ``specification'' (we do not
+\cite{dutertre99b}---another solution to the Priority Inversion
-know whether this includes also code for proofs).  Our formalisation
+problem, which however needs
+static analysis of programs in order to avoid it.
+His formalisation consists of 407 lemmas and 2500 lines of (PVS) code.  Our formalisation
 consists of around 210 lemmas and overall 6950 lines of readable Isabelle/Isar
 code with a few apply-scripts interspersed. The formal model of PIP
 is 385 lines long; the formal correctness proof 3800 lines. Some auxiliary
-definitions and proofs took 770 lines of code. The properties relevant
+definitions and proofs span over 770 lines of code. The properties relevant
-for an implementation took 2000 lines. The code of our formalisation
+for an implementation require 2000 lines. The code of our formalisation
-can be downloaded from\\
+can be downloaded from
 \url{http://www.dcs.kcl.ac.uk/staff/urbanc/pip.html}.
 \bibliographystyle{plain}
 \bibliography{root}
 *}

changeset 332	5faa1b59e870
parent 331	c5442db6a5cb
child 333	813e7257c7c3