(*<*)
theory Paper
imports "../Implementation"
"../Correctness"
"~~/src/HOL/Library/LaTeXsugar"
begin
declare [[show_question_marks = false]]
notation (latex output)
Cons ("_::_" [78,77] 73) and
If ("(\<^raw:\textrm{>if\<^raw:}> (_)/ \<^raw:\textrm{>then\<^raw:}> (_)/ \<^raw:\textrm{>else\<^raw:}> (_))" 10) and
vt ("valid'_state") and
runing ("running") and
Prc ("'(_, _')") and
holding ("holds") and
waiting ("waits") and
Th ("T") and
Cs ("C") and
readys ("ready") and
preced ("prec") and
(* preceds ("precs") and*)
cpreced ("cprec") and
cp ("cprec") and
holdents ("resources") and
DUMMY ("\<^raw:\mbox{$\_\!\_$}>") and
cntP ("c\<^bsub>P\<^esub>") and
cntV ("c\<^bsub>V\<^esub>")
(*>*)
section {* Introduction *}
text {*
Many real-time systems need to support threads involving priorities
and locking of resources. Locking of resources ensures mutual
exclusion when accessing shared data or devices that cannot be
preempted. Priorities allow scheduling of threads that need to
finish their work within deadlines. Unfortunately, both features
can interact in subtle ways leading to a problem, called
\emph{Priority Inversion}. Suppose three threads having priorities
$H$(igh), $M$(edium) and $L$(ow). We would expect that the thread
$H$ blocks any other thread with lower priority and the thread
itself cannot be blocked indefinitely by threads with lower
priority. Alas, in a naive implementation of resource locking and
priorities this property can be violated. For this let $L$ be in the
possession of a lock for a resource that $H$ also needs. $H$ must
therefore wait for $L$ to exit the critical section and release this
lock. The problem is that $L$ might in turn be blocked by any thread
with priority $M$, and so $H$ sits there potentially waiting
indefinitely. Since $H$ is blocked by threads with lower priorities,
the problem is called Priority Inversion. It was first described in
\cite{Lampson80} in the context of the Mesa programming language
designed for concurrent programming.
If the problem of Priority Inversion is ignored, real-time systems
can become unpredictable and resulting bugs can be hard to diagnose.
The classic example where this happened is the software that
controlled the Mars Pathfinder mission in 1997 \cite{Reeves98}. On
Earth the software run mostly without any problem, but once the
spacecraft landed on Mars, it shut down at irregular, but frequent,
intervals leading to loss of project time as normal operation of the
craft could only resume the next day (the mission and data already
collected were fortunately not lost, because of a clever system
design). The reason for the shutdowns was that the scheduling
software fell victim to Priority Inversion: a low priority thread
locking a resource prevented a high priority thread from running in
time, leading to a system reset. Once the problem was found, it was
rectified by enabling the \emph{Priority Inheritance Protocol} (PIP)
\cite{Sha90}\footnote{Sha et al.~call it the \emph{Basic Priority
Inheritance Protocol} \cite{Sha90} and others sometimes also call it
\emph{Priority Boosting}, \emph{Priority Donation} or \emph{Priority
Lending}.} in the scheduling software.
The idea behind PIP is to let the thread $L$ temporarily inherit the
high priority from $H$ until $L$ leaves the critical section
unlocking the resource. This solves the problem of $H$ having to
wait indefinitely, because $L$ cannot be blocked by threads having
priority $M$. While a few other solutions exist for the Priority
Inversion problem, PIP is one that is widely deployed and
implemented. This includes VxWorks (a proprietary real-time OS used
in the Mars Pathfinder mission, in Boeing's 787 Dreamliner, Honda's
ASIMO robot, etc.) and ThreadX (another proprietary real-time OS
used in nearly all HP inkjet printers \cite{ThreadX}), but also the
POSIX 1003.1c Standard realised for example in libraries for
FreeBSD, Solaris and Linux.
Two advantages of PIP are that it is deterministic and that
increasing the priority of a thread can be performed dynamically by
the scheduler. This is in contrast to \emph{Priority Ceiling}
\cite{Sha90}, another solution to the Priority Inversion problem,
which requires static analysis of the program in order to prevent
Priority Inversion, and also in contrast to the approach taken in
the Windows NT scheduler, which avoids this problem by randomly
boosting the priority of ready low-priority threads (see for
instance~\cite{WINDOWSNT}). However, there has also been strong
criticism against PIP. For instance, PIP cannot prevent deadlocks
when lock dependencies are circular, and also blocking times can be
substantial (more than just the duration of a critical section).
Though, most criticism against PIP centres around unreliable
implementations and PIP being too complicated and too inefficient.
For example, Yodaiken writes in \cite{Yodaiken02}:
\begin{quote}
\it{}``Priority inheritance is neither efficient nor reliable. Implementations
are either incomplete (and unreliable) or surprisingly complex and intrusive.''
\end{quote}
\noindent He suggests avoiding PIP altogether by designing the
system so that no priority inversion may happen in the first
place. However, such ideal designs may not always be achievable in
practice.
In our opinion, there is clearly a need for investigating correct
algorithms for PIP. A few specifications for PIP exist (in informal
English) and also a few high-level descriptions of implementations
(e.g.~in the textbooks \cite[Section 12.3.1]{Liu00} and
\cite[Section 5.6.5]{Vahalia96}), but they help little with actual
implementations. That this is a problem in practice is proved by an
email by Baker, who wrote on 13 July 2009 on the Linux Kernel
mailing list:
\begin{quote}
\it{}``I observed in the kernel code (to my disgust), the Linux PIP
implementation is a nightmare: extremely heavy weight, involving
maintenance of a full wait-for graph, and requiring updates for a
range of events, including priority changes and interruptions of
wait operations.''
\end{quote}
\noindent The criticism by Yodaiken, Baker and others suggests
another look at PIP from a more abstract level (but still concrete
enough to inform an implementation), and makes PIP a good candidate
for a formal verification. An additional reason is that the original
specification of PIP~\cite{Sha90}, despite being informally
``proved'' correct, is actually \emph{flawed}.
Yodaiken \cite{Yodaiken02} and also Moylan et
al.~\cite{deinheritance} point to a subtlety that had been
overlooked in the informal proof by Sha et al. They specify PIP in
\cite{Sha90} so that after the thread (whose priority has been
raised) completes its critical section and releases the lock, it
``{\it returns to its original priority level}''. This leads them to
believe that an implementation of PIP is ``{\it rather
straightforward}''~\cite{Sha90}. Unfortunately, as Yodaiken and
Moylan et al.~point out, this behaviour is too simplistic. Moylan et
al.~write that there are ``{\it some hidden
traps}''~\cite{deinheritance}. Consider the case where the low
priority thread $L$ locks \emph{two} resources, and two
high-priority threads $H$ and $H'$ each wait for one of them. If
$L$ releases one resource so that $H$, say, can proceed, then we
still have Priority Inversion with $H'$ (which waits for the other
resource). The correct behaviour for $L$ is to switch to the highest
remaining priority of the threads that it blocks. A similar error
is made in the textbook \cite[Section 2.3.1]{book} which specifies
for a process that inherited a higher priority and exits a critical
section ``{\it it resumes the priority it had at the point of entry
into the critical section}''. This error can also be found in the
more recent textbook \cite[Page 119]{Laplante11} where the authors
state: ``{\it when [the task] exits the critical section that caused
the block, it reverts to the priority it had when it entered that
section}''. The textbook \cite[Page 286]{Liu00} contains a simlar
flawed specification and even goes on to develop pseudo-code based
on this flawed specification. Accordingly, the operating system
primitives for inheritance and restoration of priorities in
\cite{Liu00} depend on maintaining a data structure called
\emph{inheritance log}. This log is maintained for every thread and
broadly specified as containing ``{\it [h]istorical information on
how the thread inherited its current priority}'' \cite[Page
527]{Liu00}. Unfortunately, the important information about actually
computing the priority to be restored solely from this log is not
explained in \cite{Liu00} but left as an ``{\it excercise}'' to the
reader. Of course, a correct version of PIP does not need to
maintain this (potentially expensive) data structure at
all. Surprisingly also the widely read and frequently updated
textbook \cite{Silberschatz13} gives the wrong specification. For
example on Page 254 the authors write: ``{\it Upon releasing the
lock, the [low-priority] thread will revert to its original
priority.}'' The same error is also repeated later in this textbook.
While \cite{Laplante11,Liu00,book,Sha90,Silberschatz13} are the only
formal publications we have found that specify the incorrect
behaviour, it seems also many informal descriptions of PIP overlook
the possibility that another high-priority might wait for a
low-priority process to finish. A notable exception is the texbook
\cite{buttazzo}, which gives the correct behaviour of resetting the
priority of a thread to the highest remaining priority of the
threads it blocks. This textbook also gives an informal proof for
the correctness of PIP in the style of Sha et al. Unfortunately,
this informal proof is too vague to be useful for formalising the
correctness of PIP and the specification leaves out nearly all
details in order to implement PIP efficiently.\medskip\smallskip
%
%The advantage of formalising the
%correctness of a high-level specification of PIP in a theorem prover
%is that such issues clearly show up and cannot be overlooked as in
%informal reasoning (since we have to analyse all possible behaviours
%of threads, i.e.~\emph{traces}, that could possibly happen).
\noindent {\bf Contributions:} There have been earlier formal
investigations into PIP \cite{Faria08,Jahier09,Wellings07}, but they
employ model checking techniques. This paper presents a formalised
and mechanically checked proof for the correctness of PIP. For this
we needed to design a new correctness criterion for PIP. In contrast
to model checking, our formalisation provides insight into why PIP
is correct and allows us to prove stronger properties that, as we
will show, can help with an efficient implementation of PIP. We
illustrate this with an implementation of PIP in the educational
PINTOS operating system \cite{PINTOS}. For example, we found by
``playing'' with the formalisation that the choice of the next
thread to take over a lock when a resource is released is irrelevant
for PIP being correct---a fact that has not been mentioned in the
literature and not been used in the reference implementation of PIP
in PINTOS. This fact, however, is important for an efficient
implementation of PIP, because we can give the lock to the thread
with the highest priority so that it terminates more quickly. We
were also being able to generalise the scheduler of Sha et
al.~\cite{Sha90} to the practically relevant case where critical
sections can overlap; see Figure~\ref{overlap} \emph{a)} below for
an example of this restriction. In the existing literature there is
no proof and also no proving method that cover this generalised
case.
\begin{figure}
\begin{center}
\begin{tikzpicture}[scale=1]
%%\draw[step=2mm] (0,0) grid (10,2);
\draw [->,line width=0.6mm] (0,0) -- (10,0);
\draw [->,line width=0.6mm] (0,1.5) -- (10,1.5);
\draw [line width=0.6mm, pattern=horizontal lines] (0.8,0) rectangle (4,0.5);
\draw [line width=0.6mm, pattern=north east lines] (3.0,0) rectangle (6,0.5);
\draw [line width=0.6mm, pattern=vertical lines] (5.0,0) rectangle (9,0.5);
\draw [line width=0.6mm, pattern=horizontal lines] (0.6,1.5) rectangle (4.0,2);
\draw [line width=0.6mm, pattern=north east lines] (1.0,1.5) rectangle (3.4,2);
\draw [line width=0.6mm, pattern=vertical lines] (5.0,1.5) rectangle (8.8,2);
\node at (0.8,-0.3) {@{term "P\<^sub>1"}};
\node at (3.0,-0.3) {@{term "P\<^sub>2"}};
\node at (4.0,-0.3) {@{term "V\<^sub>1"}};
\node at (5.0,-0.3) {@{term "P\<^sub>3"}};
\node at (6.0,-0.3) {@{term "V\<^sub>2"}};
\node at (9.0,-0.3) {@{term "V\<^sub>3"}};
\node at (0.6,1.2) {@{term "P\<^sub>1"}};
\node at (1.0,1.2) {@{term "P\<^sub>2"}};
\node at (3.4,1.2) {@{term "V\<^sub>2"}};
\node at (4.0,1.2) {@{term "V\<^sub>1"}};
\node at (5.0,1.2) {@{term "P\<^sub>3"}};
\node at (8.8,1.2) {@{term "V\<^sub>3"}};
\node at (10.3,0) {$t$};
\node at (10.3,1.5) {$t$};
\node at (-0.3,0.2) {$b)$};
\node at (-0.3,1.7) {$a)$};
\end{tikzpicture}\mbox{}\\[-10mm]\mbox{}
\end{center}
\caption{Assume a process is over time locking and unlocking, say, three resources.
The locking requests are labelled @{term "P\<^sub>1"}, @{term "P\<^sub>2"}, and @{term "P\<^sub>3"}
respectively, and the corresponding unlocking operations are labelled
@{term "V\<^sub>1"}, @{term "V\<^sub>2"}, and @{term "V\<^sub>3"}.
Then graph $a)$ shows \emph{properly nested} critical sections as required
by Sha et al.~\cite{Sha90} in their proof---the sections must either be contained within
each other
(the section @{term "P\<^sub>2"}--@{term "V\<^sub>2"} is contained in @{term "P\<^sub>1"}--@{term "V\<^sub>1"}) or
be independent (@{term "P\<^sub>3"}--@{term "V\<^sub>3"} is independent from the other
two). Graph $b)$ shows the general case where
the locking and unlocking of different critical sections can
overlap.\label{overlap}}
\end{figure}
*}
section {* Formal Model of the Priority Inheritance Protocol\label{model} *}
text {*
The Priority Inheritance Protocol, short PIP, is a scheduling
algorithm for a single-processor system.\footnote{We shall come back
later to the case of PIP on multi-processor systems.}
Following good experience in earlier work \cite{Wang09},
our model of PIP is based on Paulson's inductive approach to protocol
verification \cite{Paulson98}. In this approach a \emph{state} of a system is
given by a list of events that happened so far (with new events prepended to the list).
\emph{Events} of PIP fall
into five categories defined as the datatype:
\begin{isabelle}\ \ \ \ \ %%%
\mbox{\begin{tabular}{r@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {\hspace{7mm}}l}
\isacommand{datatype} event
& @{text "="} & @{term "Create thread priority\<iota>"}\\
& @{text "|"} & @{term "Exit thread"} \\
& @{text "|"} & @{term "Set thread priority\<iota>"} & {\rm reset of the priority for} @{text thread}\\
& @{text "|"} & @{term "P thread cs"} & {\rm request of resource} @{text "cs"} {\rm by} @{text "thread"}\\
& @{text "|"} & @{term "V thread cs"} & {\rm release of resource} @{text "cs"} {\rm by} @{text "thread"}
\end{tabular}}
\end{isabelle}
\noindent
whereby threads, priorities and (critical) resources are represented
as natural numbers. The event @{term Set} models the situation that
a thread obtains a new priority given by the programmer or
user (for example via the {\tt nice} utility under UNIX). As in Paulson's work, we
need to define functions that allow us to make some observations
about states. One, called @{term threads}, calculates the set of
``live'' threads that we have seen so far:
\begin{isabelle}\ \ \ \ \ %%%
\mbox{\begin{tabular}{lcl}
@{thm (lhs) threads.simps(1)} & @{text "\<equiv>"} &
@{thm (rhs) threads.simps(1)}\\
@{thm (lhs) threads.simps(2)[where thread="th"]} & @{text "\<equiv>"} &
@{thm (rhs) threads.simps(2)[where thread="th"]}\\
@{thm (lhs) threads.simps(3)[where thread="th"]} & @{text "\<equiv>"} &
@{thm (rhs) threads.simps(3)[where thread="th"]}\\
@{term "threads (DUMMY#s)"} & @{text "\<equiv>"} & @{term "threads s"}\\
\end{tabular}}
\end{isabelle}
\noindent
In this definition @{term "DUMMY # DUMMY"} stands for list-cons and @{term "[]"} for the empty list.
Another function calculates the priority for a thread @{text "th"}, which is
defined as
\begin{isabelle}\ \ \ \ \ %%%
\mbox{\begin{tabular}{lcl}
@{thm (lhs) priority.simps(1)[where thread="th"]} & @{text "\<equiv>"} &
@{thm (rhs) priority.simps(1)[where thread="th"]}\\
@{thm (lhs) priority.simps(2)[where thread="th" and thread'="th'"]} & @{text "\<equiv>"} &
@{thm (rhs) priority.simps(2)[where thread="th" and thread'="th'"]}\\
@{thm (lhs) priority.simps(3)[where thread="th" and thread'="th'"]} & @{text "\<equiv>"} &
@{thm (rhs) priority.simps(3)[where thread="th" and thread'="th'"]}\\
@{term "priority th (DUMMY#s)"} & @{text "\<equiv>"} & @{term "priority th s"}\\
\end{tabular}}
\end{isabelle}
\noindent
In this definition we set @{text 0} as the default priority for
threads that have not (yet) been created. The last function we need
calculates the ``time'', or index, at which time a thread had its
priority last set.
\begin{isabelle}\ \ \ \ \ %%%
\mbox{\begin{tabular}{lcl}
@{thm (lhs) last_set.simps(1)[where thread="th"]} & @{text "\<equiv>"} &
@{thm (rhs) last_set.simps(1)[where thread="th"]}\\
@{thm (lhs) last_set.simps(2)[where thread="th" and thread'="th'"]} & @{text "\<equiv>"} &
@{thm (rhs) last_set.simps(2)[where thread="th" and thread'="th'"]}\\
@{thm (lhs) last_set.simps(3)[where thread="th" and thread'="th'"]} & @{text "\<equiv>"} &
@{thm (rhs) last_set.simps(3)[where thread="th" and thread'="th'"]}\\
@{term "last_set th (DUMMY#s)"} & @{text "\<equiv>"} & @{term "last_set th s"}\\
\end{tabular}}
\end{isabelle}
\noindent
In this definition @{term "length s"} stands for the length of the list
of events @{text s}. Again the default value in this function is @{text 0}
for threads that have not been created yet. A \emph{precedence} of a thread @{text th} in a
state @{text s} is the pair of natural numbers defined as
\begin{isabelle}\ \ \ \ \ %%%
@{thm preced_def[where thread="th"]}
\end{isabelle}
\noindent
We also use the abbreviation
\begin{isabelle}\ \ \ \ \ %%%
???preceds s ths
\end{isabelle}
\noindent
for the set of precedences of threads @{text ths} in state @{text s}.
The point of precedences is to schedule threads not according to priorities (because what should
we do in case two threads have the same priority), but according to precedences.
Precedences allow us to always discriminate between two threads with equal priority by
taking into account the time when the priority was last set. We order precedences so
that threads with the same priority get a higher precedence if their priority has been
set earlier, since for such threads it is more urgent to finish their work. In an implementation
this choice would translate to a quite natural FIFO-scheduling of threads with
the same priority.
Moylan et al.~\cite{deinheritance} considered the alternative of
``time-slicing'' threads with equal priority, but found that it does not lead to
advantages in practice. On the contrary, according to their work having a policy
like our FIFO-scheduling of threads with equal priority reduces the number of
tasks involved in the inheritance process and thus minimises the number
of potentially expensive thread-switches.
We will also need counters for @{term P} and @{term V} events of a thread @{term th}
in a state @{term s}. This can be straightforwardly defined in Isabelle as
\begin{isabelle}\ \ \ \ \ %%%
\mbox{\begin{tabular}{@ {}l}
@{thm cntP_def}\\
@{thm cntV_def}
\end{tabular}}
\end{isabelle}
\noindent using the predefined function @{const count} for lists.
Next, we introduce the concept of \emph{waiting queues}. They are
lists of threads associated with every resource. The first thread in
this list (i.e.~the head, or short @{term hd}) is chosen to be the one
that is in possession of the
``lock'' of the corresponding resource. We model waiting queues as
functions, below abbreviated as @{text wq}. They take a resource as
argument and return a list of threads. This allows us to define
when a thread \emph{holds}, respectively \emph{waits} for, a
resource @{text cs} given a waiting queue function @{text wq}.
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
@{thm cs_holding_def[where thread="th"]}\\
@{thm cs_waiting_def[where thread="th"]}
\end{tabular}
\end{isabelle}
\noindent
In this definition we assume @{text "set"} converts a list into a set.
Note that in the first definition the condition about @{text "th \<in> set (wq cs)"} does not follow
from @{text "th = hd (set (wq cs))"}, since the head of an empty list is undefined in Isabelle/HOL.
At the beginning, that is in the state where no thread is created yet,
the waiting queue function will be the function that returns the
empty list for every resource.
\begin{isabelle}\ \ \ \ \ %%%
@{abbrev all_unlocked}\hfill\numbered{allunlocked}
\end{isabelle}
\noindent
Using @{term "holding"} and @{term waiting}, we can introduce \emph{Resource Allocation Graphs}
(RAG), which represent the dependencies between threads and resources.
We represent RAGs as relations using pairs of the form
\begin{isabelle}\ \ \ \ \ %%%
@{term "(Th th, Cs cs)"} \hspace{5mm}{\rm and}\hspace{5mm}
@{term "(Cs cs, Th th)"}
\end{isabelle}
\noindent
where the first stands for a \emph{waiting edge} and the second for a
\emph{holding edge} (@{term Cs} and @{term Th} are constructors of a
datatype for vertices). Given a waiting queue function, a RAG is defined
as the union of the sets of waiting and holding edges, namely
\begin{isabelle}\ \ \ \ \ %%%
@{thm cs_RAG_def}
\end{isabelle}
\begin{figure}[t]
\begin{center}
\newcommand{\fnt}{\fontsize{7}{8}\selectfont}
\begin{tikzpicture}[scale=1]
%%\draw[step=2mm] (-3,2) grid (1,-1);
\node (A) at (0,0) [draw, rounded corners=1mm, rectangle, very thick] {@{text "th\<^sub>0"}};
\node (B) at (2,0) [draw, circle, very thick, inner sep=0.4mm] {@{text "cs\<^sub>1"}};
\node (C) at (4,0.7) [draw, rounded corners=1mm, rectangle, very thick] {@{text "th\<^sub>1"}};
\node (D) at (4,-0.7) [draw, rounded corners=1mm, rectangle, very thick] {@{text "th\<^sub>2"}};
\node (E) at (6,-0.7) [draw, circle, very thick, inner sep=0.4mm] {@{text "cs\<^sub>2"}};
\node (E1) at (6, 0.2) [draw, circle, very thick, inner sep=0.4mm] {@{text "cs\<^sub>3"}};
\node (F) at (8,-0.7) [draw, rounded corners=1mm, rectangle, very thick] {@{text "th\<^sub>3"}};
\node (X) at (0,-2) [draw, rounded corners=1mm, rectangle, very thick] {@{text "th\<^sub>4"}};
\node (Y) at (2,-2) [draw, circle, very thick, inner sep=0.4mm] {@{text "cs\<^sub>4"}};
\node (Z) at (2,-2.9) [draw, circle, very thick, inner sep=0.4mm] {@{text "cs\<^sub>5"}};
\node (U1) at (4,-2) [draw, rounded corners=1mm, rectangle, very thick] {@{text "th\<^sub>5"}};
\node (U2) at (4,-2.9) [draw, rounded corners=1mm, rectangle, very thick] {@{text "th\<^sub>6"}};
\node (R) at (6,-2.9) [draw, circle, very thick, inner sep=0.4mm] {@{text "cs\<^sub>6"}};
\draw [<-,line width=0.6mm] (A) to node [pos=0.54,sloped,above=-0.5mm] {\fnt{}holding} (B);
\draw [->,line width=0.6mm] (C) to node [pos=0.4,sloped,above=-0.5mm] {\fnt{}waiting} (B);
\draw [->,line width=0.6mm] (D) to node [pos=0.4,sloped,below=-0.5mm] {\fnt{}waiting} (B);
\draw [<-,line width=0.6mm] (D) to node [pos=0.54,sloped,below=-0.5mm] {\fnt{}holding} (E);
\draw [<-,line width=0.6mm] (D) to node [pos=0.54,sloped,above=-0.5mm] {\fnt{}holding} (E1);
\draw [->,line width=0.6mm] (F) to node [pos=0.45,sloped,below=-0.5mm] {\fnt{}waiting} (E);
\draw [->,line width=0.6mm] (U1) to node [pos=0.45,sloped,below=-0.5mm] {\fnt{}waiting} (Y);
\draw [->,line width=0.6mm] (U2) to node [pos=0.45,sloped,below=-0.5mm] {\fnt{}waiting} (Z);
\draw [<-,line width=0.6mm] (X) to node [pos=0.54,sloped,below=-0.5mm] {\fnt{}holding} (Z);
\draw [<-,line width=0.6mm] (X) to node [pos=0.54,sloped,above=-0.5mm] {\fnt{}holding} (Y);
\draw [<-,line width=0.6mm] (U2) to node [pos=0.54,sloped,above=-0.5mm] {\fnt{}holding} (R);
\end{tikzpicture}
\end{center}
\caption{An instance of a Resource Allocation Graph (RAG).\label{RAGgraph}}
\end{figure}
\noindent
If there is no cycle, then every RAG can be pictured as a forrest of trees, as
for example in Figure~\ref{RAGgraph}.
We will design our scheduler
so that every thread can be in the possession of several resources, that is
it has potentially several incoming holding edges in the RAG, but has at most one outgoing
waiting edge. The reason is that when a thread asks for resource that is locked
already, then the thread is blocked and cannot ask for another resource.
Clearly, also every resource can only have at most one outgoing holding edge---indicating
that the resource is locked. In this way we can always start at a thread waiting for a
resource and ``chase'' outgoing arrows leading to a single root of a tree.
The use of relations for representing RAGs allows us to conveniently define
the notion of the \emph{dependants} of a thread using the transitive closure
operation for relations, written ~@{term "trancl DUMMY"}. This gives
\begin{isabelle}\ \ \ \ \ %%%
@{thm cs_dependants_def}
\end{isabelle}
\noindent
This definition needs to account for all threads that wait for a thread to
release a resource. This means we need to include threads that transitively
wait for a resource to be released (in the picture above this means the dependants
of @{text "th\<^sub>0"} are @{text "th\<^sub>1"} and @{text "th\<^sub>2"}, which wait for resource @{text "cs\<^sub>1"},
but also @{text "th\<^sub>3"},
which cannot make any progress unless @{text "th\<^sub>2"} makes progress, which
in turn needs to wait for @{text "th\<^sub>0"} to finish). If there is a circle of dependencies
in a RAG, then clearly
we have a deadlock. Therefore when a thread requests a resource,
we must ensure that the resulting RAG is not circular. In practice, the
programmer has to ensure this.
Next we introduce the notion of the \emph{current precedence} of a thread @{text th} in a
state @{text s}. It is defined as
\begin{isabelle}\ \ \ \ \ %%%
@{thm cpreced_def2}\hfill\numbered{cpreced}
\end{isabelle}
\noindent
where the dependants of @{text th} are given by the waiting queue function.
While the precedence @{term prec} of any thread is determined statically
(for example when the thread is
created), the point of the current precedence is to let the scheduler increase this
precedence, if needed according to PIP. Therefore the current precedence of @{text th} is
given as the maximum of the precedence @{text th} has in state @{text s} \emph{and} all
threads that are dependants of @{text th}. Since the notion @{term "dependants"} is
defined as the transitive closure of all dependent threads, we deal correctly with the
problem in the informal algorithm by Sha et al.~\cite{Sha90} where a priority of a thread is
lowered prematurely. We again introduce an abbreviation for current precedeces of
a set of threads, written @{term "cprecs wq s ths"}.
The next function, called @{term schs}, defines the behaviour of the scheduler. It will be defined
by recursion on the state (a list of events); this function returns a \emph{schedule state}, which
we represent as a record consisting of two
functions:
\begin{isabelle}\ \ \ \ \ %%%
@{text "\<lparr>wq_fun, cprec_fun\<rparr>"}
\end{isabelle}
\noindent
The first function is a waiting queue function (that is, it takes a
resource @{text "cs"} and returns the corresponding list of threads
that lock, respectively wait for, it); the second is a function that
takes a thread and returns its current precedence (see
the definition in \eqref{cpreced}). We assume the usual getter and setter methods for
such records.
In the initial state, the scheduler starts with all resources unlocked (the corresponding
function is defined in \eqref{allunlocked}) and the
current precedence of every thread is initialised with @{term "Prc 0 0"}; that means
\mbox{@{abbrev initial_cprec}}. Therefore
we have for the initial shedule state
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
@{thm (lhs) schs.simps(1)} @{text "\<equiv>"}\\
\hspace{5mm}@{term "(|wq_fun = all_unlocked, cprec_fun = (\<lambda>_::thread. Prc 0 0)|)"}
\end{tabular}
\end{isabelle}
\noindent
The cases for @{term Create}, @{term Exit} and @{term Set} are also straightforward:
we calculate the waiting queue function of the (previous) state @{text s};
this waiting queue function @{text wq} is unchanged in the next schedule state---because
none of these events lock or release any resource;
for calculating the next @{term "cprec_fun"}, we use @{text wq} and
@{term cpreced}. This gives the following three clauses for @{term schs}:
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
@{thm (lhs) schs.simps(2)} @{text "\<equiv>"}\\
\hspace{5mm}@{text "let"} @{text "wq = wq_fun (schs s)"} @{text "in"}\\
\hspace{8mm}@{term "(|wq_fun = wq\<iota>, cprec_fun = cpreced wq\<iota> (Create th prio # s)|)"}\smallskip\\
@{thm (lhs) schs.simps(3)} @{text "\<equiv>"}\\
\hspace{5mm}@{text "let"} @{text "wq = wq_fun (schs s)"} @{text "in"}\\
\hspace{8mm}@{term "(|wq_fun = wq\<iota>, cprec_fun = cpreced wq\<iota> (Exit th # s)|)"}\smallskip\\
@{thm (lhs) schs.simps(4)} @{text "\<equiv>"}\\
\hspace{5mm}@{text "let"} @{text "wq = wq_fun (schs s)"} @{text "in"}\\
\hspace{8mm}@{term "(|wq_fun = wq\<iota>, cprec_fun = cpreced wq\<iota> (Set th prio # s)|)"}
\end{tabular}
\end{isabelle}
\noindent
More interesting are the cases where a resource, say @{text cs}, is locked or released. In these cases
we need to calculate a new waiting queue function. For the event @{term "P th cs"}, we have to update
the function so that the new thread list for @{text cs} is the old thread list plus the thread @{text th}
appended to the end of that list (remember the head of this list is assigned to be in the possession of this
resource). This gives the clause
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
@{thm (lhs) schs.simps(5)} @{text "\<equiv>"}\\
\hspace{5mm}@{text "let"} @{text "wq = wq_fun (schs s)"} @{text "in"}\\
\hspace{5mm}@{text "let"} @{text "new_wq = wq(cs := (wq cs @ [th]))"} @{text "in"}\\
\hspace{8mm}@{term "(|wq_fun = new_wq, cprec_fun = cpreced new_wq (P th cs # s)|)"}
\end{tabular}
\end{isabelle}
\noindent
The clause for event @{term "V th cs"} is similar, except that we need to update the waiting queue function
so that the thread that possessed the lock is deleted from the corresponding thread list. For this
list transformation, we use
the auxiliary function @{term release}. A simple version of @{term release} would
just delete this thread and return the remaining threads, namely
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}lcl}
@{term "release []"} & @{text "\<equiv>"} & @{term "[]"}\\
@{term "release (DUMMY # qs)"} & @{text "\<equiv>"} & @{term "qs"}\\
\end{tabular}
\end{isabelle}
\noindent
In practice, however, often the thread with the highest precedence in the list will get the
lock next. We have implemented this choice, but later found out that the choice
of which thread is chosen next is actually irrelevant for the correctness of PIP.
Therefore we prove the stronger result where @{term release} is defined as
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}lcl}
@{term "release []"} & @{text "\<equiv>"} & @{term "[]"}\\
@{term "release (DUMMY # qs)"} & @{text "\<equiv>"} & @{term "SOME qs'. distinct qs' \<and> set qs' = set qs"}\\
\end{tabular}
\end{isabelle}
\noindent
where @{text "SOME"} stands for Hilbert's epsilon and implements an arbitrary
choice for the next waiting list. It just has to be a list of distinctive threads and
contains the same elements as @{text "qs"}. This gives for @{term V} the clause:
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
@{thm (lhs) schs.simps(6)} @{text "\<equiv>"}\\
\hspace{5mm}@{text "let"} @{text "wq = wq_fun (schs s)"} @{text "in"}\\
\hspace{5mm}@{text "let"} @{text "new_wq = wq(cs := release (wq cs))"} @{text "in"}\\
\hspace{8mm}@{term "(|wq_fun = new_wq, cprec_fun = cpreced new_wq (V th cs # s)|)"}
\end{tabular}
\end{isabelle}
Having the scheduler function @{term schs} at our disposal, we can
``lift'', or overload, the notions @{term waiting}, @{term holding},
@{term RAG}, @{term dependants} and @{term cp} to operate on states
only.
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}rcl}
@{thm (lhs) s_holding_abv} & @{text "\<equiv>"} & @{thm (rhs) s_holding_abv}\\
@{thm (lhs) s_waiting_abv} & @{text "\<equiv>"} & @{thm (rhs) s_waiting_abv}\\
@{thm (lhs) s_RAG_abv} & @{text "\<equiv>"} & @{thm (rhs) s_RAG_abv}\\
@{thm (lhs) s_dependants_abv}& @{text "\<equiv>"} & @{thm (rhs) s_dependants_abv}\\
@{thm (lhs) cp_def} & @{text "\<equiv>"} & @{thm (rhs) cp_def}\\
\end{tabular}
\end{isabelle}
\noindent
With these abbreviations in place we can introduce
the notion of a thread being @{term ready} in a state (i.e.~threads
that do not wait for any resource, which are the roots of the trees
in the RAG, see Figure~\ref{RAGgraph}). The @{term running} thread
is then the thread with the highest current precedence of all ready threads.
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
@{thm readys_def}\\
@{thm runing_def}
\end{tabular}
\end{isabelle}
\noindent
%%In the second definition @{term "DUMMY ` DUMMY"} stands for the image of a set under a function.
Note that in the initial state, that is where the list of events is empty, the set
@{term threads} is empty and therefore there is neither a thread ready nor running.
If there is one or more threads ready, then there can only be \emph{one} thread
running, namely the one whose current precedence is equal to the maximum of all ready
threads. We use sets to capture both possibilities.
We can now also conveniently define the set of resources that are locked by a thread in a
given state and also when a thread is detached in a state (meaning the thread neither
holds nor waits for a resource---in the RAG this would correspond to an
isolated node without any incoming and outgoing edges, see Figure~\ref{RAGgraph}):
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
@{thm holdents_def}\\
@{thm detached_def}
\end{tabular}
\end{isabelle}
%\noindent
%The second definition states that @{text th} in @{text s}.
Finally we can define what a \emph{valid state} is in our model of PIP. For
example we cannot expect to be able to exit a thread, if it was not
created yet.
These validity constraints on states are characterised by the
inductive predicate @{term "step"} and @{term vt}. We first give five inference rules
for @{term step} relating a state and an event that can happen next.
\begin{center}
\begin{tabular}{c}
@{thm[mode=Rule] thread_create[where thread=th]}\hspace{1cm}
@{thm[mode=Rule] thread_exit[where thread=th]}
\end{tabular}
\end{center}
\noindent
The first rule states that a thread can only be created, if it is not alive yet.
Similarly, the second rule states that a thread can only be terminated if it was
running and does not lock any resources anymore (this simplifies slightly our model;
in practice we would expect the operating system releases all locks held by a
thread that is about to exit). The event @{text Set} can happen
if the corresponding thread is running.
\begin{center}
@{thm[mode=Rule] thread_set[where thread=th]}
\end{center}
\noindent
If a thread wants to lock a resource, then the thread needs to be
running and also we have to make sure that the resource lock does
not lead to a cycle in the RAG. In practice, ensuring the latter
is the responsibility of the programmer. In our formal
model we brush aside these problematic cases in order to be able to make
some meaningful statements about PIP.\footnote{This situation is
similar to the infamous \emph{occurs check} in Prolog: In order to say
anything meaningful about unification, one needs to perform an occurs
check. But in practice the occurs check is omitted and the
responsibility for avoiding problems rests with the programmer.}
\begin{center}
@{thm[mode=Rule] thread_P[where thread=th]}
\end{center}
\noindent
Similarly, if a thread wants to release a lock on a resource, then
it must be running and in the possession of that lock. This is
formally given by the last inference rule of @{term step}.
\begin{center}
@{thm[mode=Rule] thread_V[where thread=th]}
\end{center}
\noindent
Note, however, that apart from the circularity condition, we do not make any
assumption on how different resources can be locked and released relative to each
other. In our model it is possible that critical sections overlap. This is in
contrast to Sha et al \cite{Sha90} who require that critical sections are
properly nested (recall Fig.~\ref{overlap}).
A valid state of PIP can then be conveniently be defined as follows:
\begin{center}
\begin{tabular}{c}
@{thm[mode=Axiom] vt_nil}\hspace{1cm}
@{thm[mode=Rule] vt_cons}
\end{tabular}
\end{center}
\noindent
This completes our formal model of PIP. In the next section we present
properties that show our model of PIP is correct.
*}
section {* The Correctness Proof *}
(*<*)
context extend_highest_gen
begin
(*>*)
text {*
Sha et al.~state their first correctness criterion for PIP in terms
of the number of low-priority threads \cite[Theorem 3]{Sha90}: if
there are @{text n} low-priority threads, then a blocked job with
high priority can only be blocked a maximum of @{text n} times.
Their second correctness criterion is given in terms of the number
of critical resources \cite[Theorem 6]{Sha90}: if there are @{text
m} critical resources, then a blocked job with high priority can
only be blocked a maximum of @{text m} times. Both results on their
own, strictly speaking, do \emph{not} prevent indefinite, or
unbounded, Priority Inversion, because if a low-priority thread does
not give up its critical resource (the one the high-priority thread
is waiting for), then the high-priority thread can never run. The
argument of Sha et al.~is that \emph{if} threads release locked
resources in a finite amount of time, then indefinite Priority
Inversion cannot occur---the high-priority thread is guaranteed to
run eventually. The assumption is that programmers must ensure that
threads are programmed in this way. However, even taking this
assumption into account, the correctness properties of Sha et
al.~are \emph{not} true for their version of PIP---despite being
``proved''. As Yodaiken \cite{Yodaiken02} and Moylan et
al.~\cite{deinheritance} pointed out: If a low-priority thread
possesses locks to two resources for which two high-priority threads
are waiting for, then lowering the priority prematurely after giving
up only one lock, can cause indefinite Priority Inversion for one of
the high-priority threads, invalidating their two bounds.
Even when fixed, their proof idea does not seem to go through for
us, because of the way we have set up our formal model of PIP. One
reason is that we allow critical sections, which start with a @{text
P}-event and finish with a corresponding @{text V}-event, to
arbitrarily overlap (something Sha et al.~explicitly exclude).
Therefore we have designed a different correctness criterion for
PIP. The idea behind our criterion is as follows: for all states
@{text s}, we know the corresponding thread @{text th} with the
highest precedence; we show that in every future state (denoted by
@{text "s' @ s"}) in which @{text th} is still alive, either @{text
th} is running or it is blocked by a thread that was alive in the
state @{text s} and was waiting for or in the possession of a lock
in @{text s}. Since in @{text s}, as in every state, the set of
alive threads is finite, @{text th} can only be blocked a finite
number of times. This is independent of how many threads of lower
priority are created in @{text "s'"}. We will actually prove a
stronger statement where we also provide the current precedence of
the blocking thread.
However, this correctness criterion hinges upon a number of
assumptions about the states @{text s} and @{text "s' @ s"}, the
thread @{text th} and the events happening in @{text s'}. We list
them next:
\begin{quote}
{\bf Assumptions on the states {\boldmath@{text s}} and
{\boldmath@{text "s' @ s"}:}} We need to require that @{text "s"} and
@{text "s' @ s"} are valid states:
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{l}
@{term "vt s"}, @{term "vt (s' @ s)"}
\end{tabular}
\end{isabelle}
\end{quote}
\begin{quote}
{\bf Assumptions on the thread {\boldmath@{text "th"}:}}
The thread @{text th} must be alive in @{text s} and
has the highest precedence of all alive threads in @{text s}. Furthermore the
priority of @{text th} is @{text prio} (we need this in the next assumptions).
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{l}
@{term "th \<in> threads s"}\\
@{term "prec th s = Max (cprecs s (threads s))"}\\
@{term "prec th s = (prio, DUMMY)"}
\end{tabular}
\end{isabelle}
\end{quote}
\begin{quote}
{\bf Assumptions on the events in {\boldmath@{text "s'"}:}} We want to prove that @{text th} cannot
be blocked indefinitely. Of course this can happen if threads with higher priority
than @{text th} are continuously created in @{text s'}. Therefore we have to assume that
events in @{text s'} can only create (respectively set) threads with equal or lower
priority than @{text prio} of @{text th}. We also need to assume that the
priority of @{text "th"} does not get reset and also that @{text th} does
not get ``exited'' in @{text "s'"}. This can be ensured by assuming the following three implications.
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{l}
{If}~~@{text "Create th' prio' \<in> set s'"}~~{then}~~@{text "prio' \<le> prio"}\\
{If}~~@{text "Set th' prio' \<in> set s'"}~~{then}~~@{text "th' \<noteq> th"}~~{and}~~@{text "prio' \<le> prio"}\\
{If}~~@{text "Exit th' \<in> set s'"}~~{then}~~@{text "th' \<noteq> th"}\\
\end{tabular}
\end{isabelle}
\end{quote}
\noindent The locale mechanism of Isabelle helps us to manage
conveniently such assumptions~\cite{Haftmann08}. Under these
assumptions we shall prove the following correctness property:
\begin{theorem}\label{mainthm}
Given the assumptions about states @{text "s"} and @{text "s' @ s"},
the thread @{text th} and the events in @{text "s'"}, then either
\begin{itemize}
\item @{term "th \<in> running (s' @ s)"} or\medskip
\item there exists a thread @{term "th'"} with @{term "th' \<noteq> th"}
and @{term "th' \<in> running (s' @ s)"} such that @{text "th' \<in> threads
s"}, @{text "\<not> detached s th'"} and @{term "cp (s' @ s) th' = prec
th s"}.
\end{itemize}
\end{theorem}
\noindent This theorem ensures that the thread @{text th}, which has
the highest precedence in the state @{text s}, is either running in
state @{term "s' @ s"}, or can only be blocked in the state @{text
"s' @ s"} by a thread @{text th'} that already existed in @{text s}
and requested or had a lock on at least one resource---that means
the thread was not \emph{detached} in @{text s}. As we shall see
shortly, that means there are only finitely many threads that can
block @{text th} in this way and then they need to run with the same
precedence as @{text th}.
Like in the argument by Sha et al.~our finite bound does not
guarantee absence of indefinite Priority Inversion. For this we
further have to assume that every thread gives up its resources
after a finite amount of time. We found that this assumption is
awkward to formalise in our model. There are mainly two reasons for
this: First, we do not specify what ``running'' the code of a thread
means, for example by giving an operational semantics for machine
instructions. Therefore we cannot characterise what are ``good''
programs that contain for every looking request also a corresponding
unlocking request for a resource. Second, we would need to specify a
kind of global clock that tracks the time how long a thread locks a
resource. But this seems difficult, because how do we conveniently
distinguish between a thread that ``just'' locks a resource for a
very long time and one that locks it forever. Therefore we decided
to leave out this property and let the programmer assume the
responsibility to program threads in such a benign manner (in
addition to causing no circularity in the RAG). In this detail, we
do not make any progress in comparison with the work by Sha et al.
However, we are able to combine their two separate bounds into a
single theorem improving their bound.
In what follows we will describe properties of PIP that allow us to
prove Theorem~\ref{mainthm} and, when instructive, briefly describe
our argument. Recall we want to prove that in state @{term "s' @ s"}
either @{term th} is either running or blocked by a thread @{term
"th'"} (@{term "th \<noteq> th'"}) which was alive in state @{term s}. We
can show that
\begin{lemma}
If @{thm (prem 2) eq_pv_blocked}
then @{thm (concl) eq_pv_blocked}
\end{lemma}
\begin{lemma}
If @{thm (prem 2) eq_pv_persist}
then @{thm (concl) eq_pv_persist}
\end{lemma}
\subsection*{\bf OUTLINE}
Since @{term "th"} is the most urgent thread, if it is somehow
blocked, people want to know why and wether this blocking is
reasonable.
@{thm [source] th_blockedE} @{thm th_blockedE}
if @{term "th"} is blocked, then there is a path leading from
@{term "th"} to @{term "th'"}, which means:
there is a chain of demand leading from @{term th} to @{term th'}.
%%% in other words
%%% th -> cs1 -> th1 -> cs2 -> th2 -> ... -> csn -> thn -> cs -> th'.
%%%
%%% We says that th is blocked by "th'".
THEN
@{thm [source] vat_t.th_chain_to_ready} @{thm vat_t.th_chain_to_ready}
It is basic propery with non-trival proof.
THEN
@{thm [source] max_preced} @{thm max_preced}
which says @{term "th"} holds the max precedence.
THEN
@{thm [source] th_cp_max th_cp_preced th_kept}
@{thm th_cp_max th_cp_preced th_kept}
THENTHEN
(here) %@ {thm [source] runing_inversion_4} @ {thm runing_inversion_4}
which explains what the @{term "th'"} looks like. Now, we have found the
@{term "th'"} which blocks @{term th}, we need to know more about it.
To see what kind of thread can block @{term th}.
From these two lemmas we can see the correctness of PIP, which is
that: the blockage of th is reasonable and under control.
Lemmas we want to describe:
\begin{lemma}
@{thm runing_cntP_cntV_inv}
\end{lemma}
\noindent
Remember we do not have the well-nestedness restriction in our
proof, which means the difference between the counters @{const cntV}
and @{const cntP} can be larger than @{term 1}.
\begin{lemma}
@{thm runing_inversion}
\end{lemma}
\begin{lemma}
@{thm th_blockedE}
\end{lemma}
\subsection*{END OUTLINE}
In what follows we will describe properties of PIP that allow us to prove
Theorem~\ref{mainthm} and, when instructive, briefly describe our argument.
It is relatively easy to see that:
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
@{text "running s \<subseteq> ready s \<subseteq> threads s"}\\
@{thm[mode=IfThen] finite_threads}
\end{tabular}
\end{isabelle}
\noindent
The second property is by induction on @{term vt}. The next three
properties are:
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
HERE??
%@ {thm[mode=IfThen] waiting_unique[of _ _ "cs\<^sub>1" "cs\<^sub>2"]}\\
%@ {thm[mode=IfThen] held_unique[of _ "th\<^sub>1" _ "th\<^sub>2"]}\\
%@ {thm[mode=IfThen] runing_unique[of _ "th\<^sub>1" "th\<^sub>2"]}
\end{tabular}
\end{isabelle}
\noindent
The first property states that every waiting thread can only wait for a single
resource (because it gets suspended after requesting that resource); the second
that every resource can only be held by a single thread;
the third property establishes that in every given valid state, there is
at most one running thread. We can also show the following properties
about the @{term RAG} in @{text "s"}.
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
HERE?? %@{text If}~@ {thm (prem 1) acyclic_RAG}~@{text "then"}:\\
\hspace{5mm}@{thm (concl) acyclic_RAG},
@{thm (concl) finite_RAG} and
@{thm (concl) wf_dep_converse},\\
%\hspace{5mm}@{text "if"}~@ {thm (prem 2) dm_RAG_threads}~@{text "then"}~@{thm (concl) dm_RAG_threads}
and\\
%\hspace{5mm}@{text "if"}~@ {thm (prem 2) range_in}~@{text "then"}~@{thm (concl) range_in}.
\end{tabular}
\end{isabelle}
\noindent
The acyclicity property follows from how we restricted the events in
@{text step}; similarly the finiteness and well-foundedness property.
The last two properties establish that every thread in a @{text "RAG"}
(either holding or waiting for a resource) is a live thread.
The key lemma in our proof of Theorem~\ref{mainthm} is as follows:
\begin{lemma}\label{mainlem}
Given the assumptions about states @{text "s"} and @{text "s' @ s"},
the thread @{text th} and the events in @{text "s'"},
if @{term "th' \<in> threads (s' @ s)"}, @{text "th' \<noteq> th"} and @{text "detached (s' @ s) th'"}\\
then @{text "th' \<notin> running (s' @ s)"}.
\end{lemma}
\noindent
The point of this lemma is that a thread different from @{text th} (which has the highest
precedence in @{text s}) and not holding any resource, cannot be running
in the state @{text "s' @ s"}.
\begin{proof}
Since thread @{text "th'"} does not hold any resource, no thread can depend on it.
Therefore its current precedence @{term "cp (s' @ s) th'"} equals its own precedence
@{term "prec th' (s' @ s)"}. Since @{text "th"} has the highest precedence in the
state @{text "(s' @ s)"} and precedences are distinct among threads, we have
@{term "prec th' (s' @s ) < prec th (s' @ s)"}. From this
we have @{term "cp (s' @ s) th' < prec th (s' @ s)"}.
Since @{text "prec th (s' @ s)"} is already the highest
@{term "cp (s' @ s) th"} can not be higher than this and can not be lower either (by
definition of @{term "cp"}). Consequently, we have @{term "prec th (s' @ s) = cp (s' @ s) th"}.
Finally we have @{term "cp (s' @ s) th' < cp (s' @ s) th"}.
By defintion of @{text "running"}, @{text "th'"} can not be running in state
@{text "s' @ s"}, as we had to show.\qed
\end{proof}
\noindent
Since @{text "th'"} is not able to run in state @{text "s' @ s"}, it is not able to
issue a @{text "P"} or @{text "V"} event. Therefore if @{text "s' @ s"} is extended
one step further, @{text "th'"} still cannot hold any resource. The situation will
not change in further extensions as long as @{text "th"} holds the highest precedence.
From this lemma we can deduce Theorem~\ref{mainthm}: that @{text th} can only be
blocked by a thread @{text th'} that
held some resource in state @{text s} (that is not @{text "detached"}). And furthermore
that the current precedence of @{text th'} in state @{text "(s' @ s)"} must be equal to the
precedence of @{text th} in @{text "s"}.
We show this theorem by induction on @{text "s'"} using Lemma~\ref{mainlem}.
This theorem gives a stricter bound on the threads that can block @{text th} than the
one obtained by Sha et al.~\cite{Sha90}:
only threads that were alive in state @{text s} and moreover held a resource.
This means our bound is in terms of both---alive threads in state @{text s}
and number of critical resources. Finally, the theorem establishes that the blocking threads have the
current precedence raised to the precedence of @{text th}.
We can furthermore prove that under our assumptions no deadlock exists in the state @{text "s' @ s"}
by showing that @{text "running (s' @ s)"} is not empty.
\begin{lemma}
Given the assumptions about states @{text "s"} and @{text "s' @ s"},
the thread @{text th} and the events in @{text "s'"},
@{term "running (s' @ s) \<noteq> {}"}.
\end{lemma}
\begin{proof}
If @{text th} is blocked, then by following its dependants graph, we can always
reach a ready thread @{text th'}, and that thread must have inherited the
precedence of @{text th}.\qed
\end{proof}
%The following lemmas show how every node in RAG can be chased to ready threads:
%\begin{enumerate}
%\item Every node in RAG can be chased to a ready thread (@{text "chain_building"}):
% @ {thm [display] chain_building[rule_format]}
%\item The ready thread chased to is unique (@{text "dchain_unique"}):
% @ {thm [display] dchain_unique[of _ _ "th\<^sub>1" "th\<^sub>2"]}
%\end{enumerate}
%Some deeper results about the system:
%\begin{enumerate}
%\item The maximum of @{term "cp"} and @{term "preced"} are equal (@{text "max_cp_eq"}):
%@ {thm [display] max_cp_eq}
%\item There must be one ready thread having the max @{term "cp"}-value
%(@{text "max_cp_readys_threads"}):
%@ {thm [display] max_cp_readys_threads}
%\end{enumerate}
%The relationship between the count of @{text "P"} and @{text "V"} and the number of
%critical resources held by a thread is given as follows:
%\begin{enumerate}
%\item The @{term "V"}-operation decreases the number of critical resources
% one thread holds (@{text "cntCS_v_dec"})
% @ {thm [display] cntCS_v_dec}
%\item The number of @{text "V"} never exceeds the number of @{text "P"}
% (@ {text "cnp_cnv_cncs"}):
% @ {thm [display] cnp_cnv_cncs}
%\item The number of @{text "V"} equals the number of @{text "P"} when
% the relevant thread is not living:
% (@{text "cnp_cnv_eq"}):
% @ {thm [display] cnp_cnv_eq}
%\item When a thread is not living, it does not hold any critical resource
% (@{text "not_thread_holdents"}):
% @ {thm [display] not_thread_holdents}
%\item When the number of @{text "P"} equals the number of @{text "V"}, the relevant
% thread does not hold any critical resource, therefore no thread can depend on it
% (@{text "count_eq_dependants"}):
% @ {thm [display] count_eq_dependants}
%\end{enumerate}
%The reason that only threads which already held some resoures
%can be runing and block @{text "th"} is that if , otherwise, one thread
%does not hold any resource, it may never have its prioirty raised
%and will not get a chance to run. This fact is supported by
%lemma @{text "moment_blocked"}:
%@ {thm [display] moment_blocked}
%When instantiating @{text "i"} to @{text "0"}, the lemma means threads which did not hold any
%resource in state @{text "s"} will not have a change to run latter. Rephrased, it means
%any thread which is running after @{text "th"} became the highest must have already held
%some resource at state @{text "s"}.
%When instantiating @{text "i"} to a number larger than @{text "0"}, the lemma means
%if a thread releases all its resources at some moment in @{text "t"}, after that,
%it may never get a change to run. If every thread releases its resource in finite duration,
%then after a while, only thread @{text "th"} is left running. This shows how indefinite
%priority inversion can be avoided.
%All these assumptions are put into a predicate @{term "extend_highest_gen"}.
%It can be proved that @{term "extend_highest_gen"} holds
%for any moment @{text "i"} in it @{term "t"} (@{text "red_moment"}):
%@ {thm [display] red_moment}
%From this, an induction principle can be derived for @{text "t"}, so that
%properties already derived for @{term "t"} can be applied to any prefix
%of @{text "t"} in the proof of new properties
%about @{term "t"} (@{text "ind"}):
%\begin{center}
%@ {thm[display] ind}
%\end{center}
%The following properties can be proved about @{term "th"} in @{term "t"}:
%\begin{enumerate}
%\item In @{term "t"}, thread @{term "th"} is kept live and its
% precedence is preserved as well
% (@{text "th_kept"}):
% @ {thm [display] th_kept}
%\item In @{term "t"}, thread @{term "th"}'s precedence is always the maximum among
% all living threads
% (@{text "max_preced"}):
% @ {thm [display] max_preced}
%\item In @{term "t"}, thread @{term "th"}'s current precedence is always the maximum precedence
% among all living threads
% (@{text "th_cp_max_preced"}):
% @ {thm [display] th_cp_max_preced}
%\item In @{term "t"}, thread @{term "th"}'s current precedence is always the maximum current
% precedence among all living threads
% (@{text "th_cp_max"}):
% @ {thm [display] th_cp_max}
%\item In @{term "t"}, thread @{term "th"}'s current precedence equals its precedence at moment
% @{term "s"}
% (@{text "th_cp_preced"}):
% @ {thm [display] th_cp_preced}
%\end{enumerate}
%The main theorem of this part is to characterizing the running thread during @{term "t"}
%(@{text "runing_inversion_2"}):
%@ {thm [display] runing_inversion_2}
%According to this, if a thread is running, it is either @{term "th"} or was
%already live and held some resource
%at moment @{text "s"} (expressed by: @{text "cntV s th' < cntP s th'"}).
%Since there are only finite many threads live and holding some resource at any moment,
%if every such thread can release all its resources in finite duration, then after finite
%duration, none of them may block @{term "th"} anymore. So, no priority inversion may happen
%then.
NOTE: about bounds in sha et al and ours:
There are low priority threads,
which do not hold any resources,
such thread will not block th.
Their Theorem 3 does not exclude such threads.
There are resources, which are not held by any low prioirty threads,
such resources can not cause blockage of th neither. And similiary,
theorem 6 does not exlude them.
Our one bound excudle them by using a different formaulation. "
*}
(*<*)
end
(*>*)
section {* Properties for an Implementation\label{implement} *}
text {*
While our formalised proof gives us confidence about the correctness of our model of PIP,
we found that the formalisation can even help us with efficiently implementing it.
For example Baker complained that calculating the current precedence
in PIP is quite ``heavy weight'' in Linux (see the Introduction).
In our model of PIP the current precedence of a thread in a state @{text s}
depends on all its dependants---a ``global'' transitive notion,
which is indeed heavy weight (see Definition shown in \eqref{cpreced}).
We can however improve upon this. For this let us define the notion
of @{term children} of a thread @{text th} in a state @{text s} as
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
HERE?? %%@ {thm children_def2}
\end{tabular}
\end{isabelle}
\noindent
where a child is a thread that is only one ``hop'' away from the thread
@{text th} in the @{term RAG} (and waiting for @{text th} to release
a resource). We can prove the following lemma.
\begin{lemma}\label{childrenlem}
HERE %@{text "If"} @ {thm (prem 1) cp_rec} @{text "then"}
\begin{center}
%@ {thm (concl) cp_rec}.
\end{center}
\end{lemma}
\noindent
That means the current precedence of a thread @{text th} can be
computed locally by considering only the current precedences of the children of @{text th}. In
effect, it only needs to be recomputed for @{text th} when one of
its children changes its current precedence. Once the current
precedence is computed in this more efficient manner, the selection
of the thread with highest precedence from a set of ready threads is
a standard scheduling operation implemented in most operating
systems.
%\begin{proof}[of Lemma~\ref{childrenlem}]
%Test
%\end{proof}
Of course the main work for implementing PIP involves the
scheduler and coding how it should react to events. Below we
outline how our formalisation guides this implementation for each
kind of events.\smallskip
*}
(*<*)
context step_create_cps
begin
(*>*)
text {*
\noindent
\colorbox{mygrey}{@{term "Create th prio"}:} We assume that the current state @{text s'} and
the next state @{term "s \<equiv> Create th prio#s'"} are both valid (meaning the event
is allowed to occur). In this situation we can show that
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
HERE ?? %@ {thm eq_dep},\\
@{thm eq_cp_th}, and\\
@{thm[mode=IfThen] eq_cp}
\end{tabular}
\end{isabelle}
\noindent
This means in an implementation we do not have to recalculate the @{text RAG} and also none of the
current precedences of the other threads. The current precedence of the created
thread @{text th} is just its precedence, namely the pair @{term "(prio, length (s::event list))"}.
\smallskip
*}
(*<*)
end
context step_exit_cps
begin
(*>*)
text {*
\noindent
\colorbox{mygrey}{@{term "Exit th"}:} We again assume that the current state @{text s'} and
the next state @{term "s \<equiv> Exit th#s'"} are both valid. We can show that
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
HERE %@ {thm eq_dep}, and\\
@{thm[mode=IfThen] eq_cp}
\end{tabular}
\end{isabelle}
\noindent
This means again we do not have to recalculate the @{text RAG} and
also not the current precedences for the other threads. Since @{term th} is not
alive anymore in state @{term "s"}, there is no need to calculate its
current precedence.
\smallskip
*}
(*<*)
end
context step_set_cps
begin
(*>*)
text {*
\noindent
\colorbox{mygrey}{@{term "Set th prio"}:} We assume that @{text s'} and
@{term "s \<equiv> Set th prio#s'"} are both valid. We can show that
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
@{thm[mode=IfThen] eq_dep}, and\\
@{thm[mode=IfThen] eq_cp_pre}
\end{tabular}
\end{isabelle}
\noindent
The first property is again telling us we do not need to change the @{text RAG}.
The second shows that the @{term cp}-values of all threads other than @{text th}
are unchanged. The reason is that @{text th} is running; therefore it is not in
the @{term dependants} relation of any other thread. This in turn means that the
change of its priority cannot affect other threads.
%The second
%however states that only threads that are \emph{not} dependants of @{text th} have their
%current precedence unchanged. For the others we have to recalculate the current
%precedence. To do this we can start from @{term "th"}
%and follow the @{term "depend"}-edges to recompute using Lemma~\ref{childrenlem}
%the @{term "cp"} of every
%thread encountered on the way. Since the @{term "depend"}
%is assumed to be loop free, this procedure will always stop. The following two lemmas show, however,
%that this procedure can actually stop often earlier without having to consider all
%dependants.
%
%\begin{isabelle}\ \ \ \ \ %%%
%\begin{tabular}{@ {}l}
%@ {thm[mode=IfThen] eq_up_self}\\
%@{text "If"} @ {thm (prem 1) eq_up}, @ {thm (prem 2) eq_up} and @ {thm (prem 3) eq_up}\\
%@{text "then"} @ {thm (concl) eq_up}.
%\end{tabular}
%\end{isabelle}
%
%\noindent
%The first lemma states that if the current precedence of @{text th} is unchanged,
%then the procedure can stop immediately (all dependent threads have their @{term cp}-value unchanged).
%The second states that if an intermediate @{term cp}-value does not change, then
%the procedure can also stop, because none of its dependent threads will
%have their current precedence changed.
\smallskip
*}
(*<*)
end
context step_v_cps_nt
begin
(*>*)
text {*
\noindent
\colorbox{mygrey}{@{term "V th cs"}:} We assume that @{text s'} and
@{term "s \<equiv> V th cs#s'"} are both valid. We have to consider two
subcases: one where there is a thread to ``take over'' the released
resource @{text cs}, and one where there is not. Let us consider them
in turn. Suppose in state @{text s}, the thread @{text th'} takes over
resource @{text cs} from thread @{text th}. We can prove
\begin{isabelle}\ \ \ \ \ %%%
@{thm RAG_s}
\end{isabelle}
\noindent
which shows how the @{text RAG} needs to be changed. The next lemma suggests
how the current precedences need to be recalculated. For threads that are
not @{text "th"} and @{text "th'"} nothing needs to be changed, since we
can show
\begin{isabelle}\ \ \ \ \ %%%
@{thm[mode=IfThen] cp_kept}
\end{isabelle}
\noindent
For @{text th} and @{text th'} we need to use Lemma~\ref{childrenlem} to
recalculate their current precedence since their children have changed. *}(*<*)end context step_v_cps_nnt begin (*>*)text {*
\noindent
In the other case where there is no thread that takes over @{text cs}, we can show how
to recalculate the @{text RAG} and also show that no current precedence needs
to be recalculated.
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
@{thm RAG_s}\\
@{thm eq_cp}
\end{tabular}
\end{isabelle}
*}
(*<*)
end
context step_P_cps_e
begin
(*>*)
text {*
\noindent
\colorbox{mygrey}{@{term "P th cs"}:} We assume that @{text s'} and
@{term "s \<equiv> P th cs#s'"} are both valid. We again have to analyse two subcases, namely
the one where @{text cs} is not locked, and one where it is. We treat the former case
first by showing that
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
@{thm RAG_s}\\
HERE %@ {thm eq_cp}
\end{tabular}
\end{isabelle}
\noindent
This means we need to add a holding edge to the @{text RAG} and no
current precedence needs to be recalculated.*}(*<*)end context step_P_cps_ne begin(*>*) text {*
\noindent
In the second case we know that resource @{text cs} is locked. We can show that
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
@{thm RAG_s}\\
HERE %@ {thm[mode=IfThen] eq_cp}
\end{tabular}
\end{isabelle}
\noindent
That means we have to add a waiting edge to the @{text RAG}. Furthermore
the current precedence for all threads that are not dependants of @{text "th'"}
are unchanged. For the others we need to follow the edges
in the @{text RAG} and recompute the @{term "cp"}. To do this we can start from @{term "th"}
and follow the @{term "depend"}-edges to recompute using Lemma~\ref{childrenlem}
the @{term "cp"} of every
thread encountered on the way. Since the @{term "depend"}
is loop free, this procedure will always stop. The following lemma shows, however,
that this procedure can actually stop often earlier without having to consider all
dependants.
\begin{isabelle}\ \ \ \ \ %%%
\begin{tabular}{@ {}l}
%%@ {t hm[mode=IfThen] eq_up_self}\\
HERE
%@{text "If"} @ {thm (prem 1) eq_up}, @ {thm (prem 2) eq_up} and @ {thm (prem 3) eq_up}\\
%@{text "then"} @ {thm (concl) eq_up}.
\end{tabular}
\end{isabelle}
\noindent
This lemma states that if an intermediate @{term cp}-value does not change, then
the procedure can also stop, because none of its dependent threads will
have their current precedence changed.
*}(*<*)end(*>*)text {*
As can be seen, a pleasing byproduct of our formalisation is that the properties in
this section closely inform an implementation of PIP, namely whether the
RAG needs to be reconfigured or current precedences need to
be recalculated for an event. This information is provided by the lemmas we proved.
We confirmed that our observations translate into practice by implementing
our version of PIP on top of PINTOS, a small operating system written in C and used for teaching at
Stanford University \cite{PINTOS}. An alternative would have been the small Xv6 operating
system used for teaching at MIT \cite{Xv6link,Xv6}. However this operating system implements
a simple round robin scheduler that lacks stubs for dealing with priorities. This
is inconvenient for our purposes.
To implement PIP in PINTOS, we only need to modify the kernel
functions corresponding to the events in our formal model. The events translate to the following
function interface in PINTOS:
\begin{center}
\begin{tabular}{|l@ {\hspace{2mm}}|l@ {\hspace{2mm}}|}
\hline
{\bf Event} & {\bf PINTOS function} \\
\hline
@{text Create} & @{ML_text "thread_create"}\\
@{text Exit} & @{ML_text "thread_exit"}\\
@{text Set} & @{ML_text "thread_set_priority"}\\
@{text P} & @{ML_text "lock_acquire"}\\
@{text V} & @{ML_text "lock_release"}\\
\hline
\end{tabular}
\end{center}
\noindent
Our implicit assumption that every event is an atomic operation is ensured by the architecture of
PINTOS (which allows disabling of interrupts when some operations are performed). The case where
an unlocked resource is given next to the waiting thread with the
highest precedence is realised in our implementation by priority queues. We implemented
them as \emph{Braun trees} \cite{Paulson96}, which provide efficient @{text "O(log n)"}-operations
for accessing and updating. In the code we shall describe below, we use the function
@{ML_text "queue_insert"}, for inserting a new element into a priority queue, and
the function @{ML_text "queue_update"}, for updating the position of an element that is already
in a queue. Both functions take an extra argument that specifies the
comparison function used for organising the priority queue.
Apart from having to implement relatively complex data\-structures in C
using pointers, our experience with the implementation has been very positive: our specification
and formalisation of PIP translates smoothly to an efficent implementation in PINTOS.
Let us illustrate this with the C-code for the function @{ML_text "lock_acquire"},
shown in Figure~\ref{code}. This function implements the operation of requesting and, if free,
locking of a resource by the current running thread. The convention in the PINTOS
code is to use the terminology \emph{locks} rather than resources.
A lock is represented as a pointer to the structure {\tt lock} (Line 1).
Lines 2 to 4 are taken from the original
code of @{ML_text "lock_acquire"} in PINTOS. They contain diagnostic code: first,
there is a check that
the lock is a ``valid'' lock
by testing whether it is not {\tt NULL}; second, a check that the code is not called
as part of an interrupt---acquiring a lock should only be initiated by a
request from a (user) thread, not from an interrupt; third, it is ensured that the
current thread does not ask twice for a lock. These assertions are supposed
to be satisfied because of the assumptions in PINTOS about how this code is called.
If not, then the assertions indicate a bug in PINTOS and the result will be
a ``kernel panic''.
\begin{figure}[tph]
\begin{lstlisting}
void lock_acquire (struct lock *lock)
{ ASSERT (lock != NULL);
ASSERT (!intr_context());
ASSERT (!lock_held_by_current_thread (lock));
enum intr_level old_level;
old_level = intr_disable();
if (lock->value == 0) {
queue_insert(thread_cprec, &lock->wq, &thread_current()->helem);
thread_current()->waiting = lock;
struct thread *pt;
pt = lock->holder;
while (pt) {
queue_update(lock_cprec, &pt->held, &lock->helem);
if (!(update_cprec(pt)))
break;
lock = pt->waiting;
if (!lock) {
queue_update(higher_cprec, &ready_queue, &pt->helem);
break;
};
queue_update(thread_cprec, &lock->wq, &pt->helem);
pt = lock->holder;
};
thread_block();
} else {
lock->value--;
lock->holder = thread_current();
queue_insert(lock_prec, &thread_current()->held, &lock->helem);
};
intr_set_level(old_level);
}
\end{lstlisting}
\caption{Our version of the {\tt lock\_acquire} function for the small operating
system PINTOS. It implements the operation corresponding to a @{text P}-event.\label{code}}
\end{figure}
Line 6 and 7 of {\tt lock\_acquire} make the operation of acquiring a lock atomic by disabling all
interrupts, but saving them for resumption at the end of the function (Line 31).
In Line 8, the interesting code with respect to scheduling starts: we
first check whether the lock is already taken (its value is then 0 indicating ``already
taken'', or 1 for being ``free''). In case the lock is taken, we enter the
if-branch inserting the current thread into the waiting queue of this lock (Line 9).
The waiting queue is referenced in the usual C-way as @{ML_text "&lock->wq"}.
Next, we record that the current thread is waiting for the lock (Line 10).
Thus we established two pointers: one in the waiting queue of the lock pointing to the
current thread, and the other from the currend thread pointing to the lock.
According to our specification in Section~\ref{model} and the properties we were able
to prove for @{text P}, we need to ``chase'' all the dependants
in the RAG (Resource Allocation Graph) and update their
current precedence; however we only have to do this as long as there is change in the
current precedence.
The ``chase'' is implemented in the while-loop in Lines 13 to 24.
To initialise the loop, we
assign in Lines 11 and 12 the variable @{ML_text pt} to the owner
of the lock.
Inside the loop, we first update the precedence of the lock held by @{ML_text pt} (Line 14).
Next, we check whether there is a change in the current precedence of @{ML_text pt}. If not,
then we leave the loop, since nothing else needs to be updated (Lines 15 and 16).
If there is a change, then we have to continue our ``chase''. We check what lock the
thread @{ML_text pt} is waiting for (Lines 17 and 18). If there is none, then
the thread @{ML_text pt} is ready (the ``chase'' is finished with finding a root in the RAG). In this
case we update the ready-queue accordingly (Lines 19 and 20). If there is a lock @{ML_text pt} is
waiting for, we update the waiting queue for this lock and we continue the loop with
the holder of that lock
(Lines 22 and 23). After all current precedences have been updated, we finally need
to block the current thread, because the lock it asked for was taken (Line 25).
If the lock the current thread asked for is \emph{not} taken, we proceed with the else-branch
(Lines 26 to 30). We first decrease the value of the lock to 0, meaning
it is taken now (Line 27). Second, we update the reference of the holder of
the lock (Line 28), and finally update the queue of locks the current
thread already possesses (Line 29).
The very last step is to enable interrupts again thus leaving the protected section.
Similar operations need to be implementated for the @{ML_text lock_release} function, which
we however do not show. The reader should note though that we did \emph{not} verify our C-code.
This is in contrast, for example, to the work on seL4, which actually verified in Isabelle/HOL
that their C-code satisfies its specification, thought this specification does not contain
anything about PIP \cite{sel4}.
Our verification of PIP however provided us with the justification for designing
the C-code. It gave us confidence that leaving the ``chase'' early, whenever
there is no change in the calculated current precedence, does not break the
correctness of the algorithm.
*}
section {* Conclusion *}
text {*
The Priority Inheritance Protocol (PIP) is a classic textbook
algorithm used in many real-time operating systems in order to avoid the problem of
Priority Inversion. Although classic and widely used, PIP does have
its faults: for example it does not prevent deadlocks in cases where threads
have circular lock dependencies.
We had two goals in mind with our formalisation of PIP: One is to
make the notions in the correctness proof by Sha et al.~\cite{Sha90}
precise so that they can be processed by a theorem prover. The reason is
that a mechanically checked proof avoids the flaws that crept into their
informal reasoning. We achieved this goal: The correctness of PIP now
only hinges on the assumptions behind our formal model. The reasoning, which is
sometimes quite intricate and tedious, has been checked by Isabelle/HOL.
We can also confirm that Paulson's
inductive method for protocol verification~\cite{Paulson98} is quite
suitable for our formal model and proof. The traditional application
area of this method is security protocols.
The second goal of our formalisation is to provide a specification for actually
implementing PIP. Textbooks, for example \cite[Section 5.6.5]{Vahalia96},
explain how to use various implementations of PIP and abstractly
discuss their properties, but surprisingly lack most details important for a
programmer who wants to implement PIP (similarly Sha et al.~\cite{Sha90}).
That this is an issue in practice is illustrated by the
email from Baker we cited in the Introduction. We achieved also this
goal: The formalisation allowed us to efficently implement our version
of PIP on top of PINTOS \cite{PINTOS}, a simple instructional operating system for the x86
architecture. It also gives the first author enough data to enable
his undergraduate students to implement PIP (as part of their OS course).
A byproduct of our formalisation effort is that nearly all
design choices for the implementation of PIP scheduler are backed up with a proved
lemma. We were also able to establish the property that the choice of
the next thread which takes over a lock is irrelevant for the correctness
of PIP. Moreover, we eliminated a crucial restriction present in
the proof of Sha et al.: they require that critical sections nest properly,
whereas our scheduler allows critical sections to overlap. What we
are not able to do is to mechanically ``synthesise'' an actual implementation
from our formalisation. To do so for C-code seems quite hard and is beyond
current technology available for Isabelle. Also our proof-method based
on events is not ``computational'' in the sense of having a concrete
algorithm behind it: our formalisation is really more about the
specification of PIP and ensuring that it has the desired properties
(the informal specification by Sha et al.~did not).
PIP is a scheduling algorithm for single-processor systems. We are
now living in a multi-processor world. Priority Inversion certainly
occurs also there, see for example \cite{Brandenburg11,Davis11}.
However, there is very little ``foundational''
work about PIP-algorithms on multi-processor systems. We are not
aware of any correctness proofs, not even informal ones. There is an
implementation of a PIP-algorithm for multi-processors as part of the
``real-time'' effort in Linux, including an informal description of the implemented scheduling
algorithm given in \cite{LINUX}. We estimate that the formal
verification of this algorithm, involving more fine-grained events,
is a magnitude harder than the one we presented here, but still
within reach of current theorem proving technology. We leave this
for future work.
To us, it seems sound reasoning about scheduling algorithms is fiendishly difficult
if done informally by ``pencil-and-paper''. We infer this from the flawed proof
in the paper by Sha et al.~\cite{Sha90} and also from \cite{Regehr} where Regehr
points out an error in a paper about Preemption
Threshold Scheduling \cite{ThreadX}. The use of a theorem prover was
invaluable to us in order to be confident about the correctness of our reasoning
(for example no corner case can be overlooked).
The most closely related work to ours is the formal verification in
PVS of the Priority Ceiling Protocol done by Dutertre
\cite{dutertre99b}---another solution to the Priority Inversion
problem, which however needs static analysis of programs in order to
avoid it. There have been earlier formal investigations
into PIP \cite{Faria08,Jahier09,Wellings07}, but they employ model
checking techniques. The results obtained by them apply,
however, only to systems with a fixed size, such as a fixed number of
events and threads. In contrast, our result applies to systems of arbitrary
size. Moreover, our result is a good
witness for one of the major reasons to be interested in machine checked
reasoning: gaining deeper understanding of the subject matter.
Our formalisation
consists of around 210 lemmas and overall 6950 lines of readable Isabelle/Isar
code with a few apply-scripts interspersed. The formal model of PIP
is 385 lines long; the formal correctness proof 3800 lines. Some auxiliary
definitions and proofs span over 770 lines of code. The properties relevant
for an implementation require 2000 lines.
The code of our formalisation
can be downloaded from the Mercurial repository at
\url{http://www.dcs.kcl.ac.uk/staff/urbanc/cgi-bin/repos.cgi/pip}.
%\medskip
%\noindent
%{\bf Acknowledgements:}
%We are grateful for the comments we received from anonymous
%referees.
\bibliographystyle{plain}
\bibliography{root}
*}
(*<*)
end
(*>*)