diff -r 2c56b20032a7 -r 0679a84b11ad prio/Paper/Paper.thy --- a/prio/Paper/Paper.thy Mon Dec 03 08:16:58 2012 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,1345 +0,0 @@ -(*<*) -theory Paper -imports "../CpsG" "../ExtGG" "~~/src/HOL/Library/LaTeXsugar" -begin - -(* -find_unused_assms CpsG -find_unused_assms ExtGG -find_unused_assms Moment -find_unused_assms Precedence_ord -find_unused_assms PrioG -find_unused_assms PrioGDef -*) - -ML {* - open Printer; - show_question_marks_default := false; - *} - -notation (latex output) - Cons ("_::_" [78,77] 73) and - vt ("valid'_state") and - runing ("running") and - birthtime ("last'_set") and - If ("(\<^raw:\textrm{>if\<^raw:}> (_)/ \<^raw:\textrm{>then\<^raw:}> (_)/ \<^raw:\textrm{>else\<^raw:}> (_))" 10) and - Prc ("'(_, _')") and - holding ("holds") and - waiting ("waits") and - Th ("T") and - Cs ("C") and - readys ("ready") and - depend ("RAG") and - preced ("prec") and - cpreced ("cprec") and - dependents ("dependants") and - cp ("cprec") and - holdents ("resources") and - original_priority ("priority") and - DUMMY ("\<^raw:\mbox{$\_\!\_$}>") - -(*abbreviation - "detached s th \ cntP s th = cntV s th" -*) -(*>*) - -section {* Introduction *} - -text {* - Many real-time systems need to support threads involving priorities and - locking of resources. Locking of resources ensures mutual exclusion - when accessing shared data or devices that cannot be - preempted. Priorities allow scheduling of threads that need to - finish their work within deadlines. Unfortunately, both features - can interact in subtle ways leading to a problem, called - \emph{Priority Inversion}. Suppose three threads having priorities - $H$(igh), $M$(edium) and $L$(ow). We would expect that the thread - $H$ blocks any other thread with lower priority and the thread itself cannot - be blocked indefinitely by threads with lower priority. Alas, in a naive - implementation of resource locking and priorities this property can - be violated. For this let $L$ be in the - possession of a lock for a resource that $H$ also needs. $H$ must - therefore wait for $L$ to exit the critical section and release this - lock. The problem is that $L$ might in turn be blocked by any - thread with priority $M$, and so $H$ sits there potentially waiting - indefinitely. Since $H$ is blocked by threads with lower - priorities, the problem is called Priority Inversion. It was first - described in \cite{Lampson80} in the context of the - Mesa programming language designed for concurrent programming. - - If the problem of Priority Inversion is ignored, real-time systems - can become unpredictable and resulting bugs can be hard to diagnose. - The classic example where this happened is the software that - controlled the Mars Pathfinder mission in 1997 \cite{Reeves98}. - Once the spacecraft landed, the software shut down at irregular - intervals leading to loss of project time as normal operation of the - craft could only resume the next day (the mission and data already - collected were fortunately not lost, because of a clever system - design). The reason for the shutdowns was that the scheduling - software fell victim to Priority Inversion: a low priority thread - locking a resource prevented a high priority thread from running in - time, leading to a system reset. Once the problem was found, it was - rectified by enabling the \emph{Priority Inheritance Protocol} (PIP) - \cite{Sha90}\footnote{Sha et al.~call it the \emph{Basic Priority - Inheritance Protocol} \cite{Sha90} and others sometimes also call it - \emph{Priority Boosting} or \emph{Priority Donation}.} in the scheduling software. - - The idea behind PIP is to let the thread $L$ temporarily inherit - the high priority from $H$ until $L$ leaves the critical section - unlocking the resource. This solves the problem of $H$ having to - wait indefinitely, because $L$ cannot be blocked by threads having - priority $M$. While a few other solutions exist for the Priority - Inversion problem, PIP is one that is widely deployed and - implemented. This includes VxWorks (a proprietary real-time OS used - in the Mars Pathfinder mission, in Boeing's 787 Dreamliner, Honda's - ASIMO robot, etc.), but also the POSIX 1003.1c Standard realised for - example in libraries for FreeBSD, Solaris and Linux. - - One advantage of PIP is that increasing the priority of a thread - can be dynamically calculated by the scheduler. This is in contrast - to, for example, \emph{Priority Ceiling} \cite{Sha90}, another - solution to the Priority Inversion problem, which requires static - analysis of the program in order to prevent Priority - Inversion. However, there has also been strong criticism against - PIP. For instance, PIP cannot prevent deadlocks when lock - dependencies are circular, and also blocking times can be - substantial (more than just the duration of a critical section). - Though, most criticism against PIP centres around unreliable - implementations and PIP being too complicated and too inefficient. - For example, Yodaiken writes in \cite{Yodaiken02}: - - \begin{quote} - \it{}``Priority inheritance is neither efficient nor reliable. Implementations - are either incomplete (and unreliable) or surprisingly complex and intrusive.'' - \end{quote} - - \noindent - He suggests avoiding PIP altogether by designing the system so that no - priority inversion may happen in the first place. However, such ideal designs may - not always be achievable in practice. - - In our opinion, there is clearly a need for investigating correct - algorithms for PIP. A few specifications for PIP exist (in English) - and also a few high-level descriptions of implementations (e.g.~in - the textbook \cite[Section 5.6.5]{Vahalia96}), but they help little - with actual implementations. That this is a problem in practice is - proved by an email by Baker, who wrote on 13 July 2009 on the Linux - Kernel mailing list: - - \begin{quote} - \it{}``I observed in the kernel code (to my disgust), the Linux PIP - implementation is a nightmare: extremely heavy weight, involving - maintenance of a full wait-for graph, and requiring updates for a - range of events, including priority changes and interruptions of - wait operations.'' - \end{quote} - - \noindent - The criticism by Yodaiken, Baker and others suggests another look - at PIP from a more abstract level (but still concrete enough - to inform an implementation), and makes PIP a good candidate for a - formal verification. An additional reason is that the original - presentation of PIP~\cite{Sha90}, despite being informally - ``proved'' correct, is actually \emph{flawed}. - - Yodaiken \cite{Yodaiken02} points to a subtlety that had been - overlooked in the informal proof by Sha et al. They specify in - \cite{Sha90} that after the thread (whose priority has been raised) - completes its critical section and releases the lock, it ``returns - to its original priority level.'' This leads them to believe that an - implementation of PIP is ``rather straightforward''~\cite{Sha90}. - Unfortunately, as Yodaiken points out, this behaviour is too - simplistic. Consider the case where the low priority thread $L$ - locks \emph{two} resources, and two high-priority threads $H$ and - $H'$ each wait for one of them. If $L$ releases one resource - so that $H$, say, can proceed, then we still have Priority Inversion - with $H'$ (which waits for the other resource). The correct - behaviour for $L$ is to switch to the highest remaining priority of - the threads that it blocks. The advantage of formalising the - correctness of a high-level specification of PIP in a theorem prover - is that such issues clearly show up and cannot be overlooked as in - informal reasoning (since we have to analyse all possible behaviours - of threads, i.e.~\emph{traces}, that could possibly happen).\medskip - - \noindent - {\bf Contributions:} There have been earlier formal investigations - into PIP \cite{Faria08,Jahier09,Wellings07}, but they employ model - checking techniques. This paper presents a formalised and - mechanically checked proof for the correctness of PIP (to our - knowledge the first one). In contrast to model checking, our - formalisation provides insight into why PIP is correct and allows us - to prove stronger properties that, as we will show, can help with an - efficient implementation of PIP in the educational PINTOS operating - system \cite{PINTOS}. For example, we found by ``playing'' with the - formalisation that the choice of the next thread to take over a lock - when a resource is released is irrelevant for PIP being correct---a - fact that has not been mentioned in the literature and not been used - in the reference implementation of PIP in PINTOS. This fact, however, is important - for an efficient implementation of PIP, because we can give the lock - to the thread with the highest priority so that it terminates more - quickly. -*} - -section {* Formal Model of the Priority Inheritance Protocol *} - -text {* - The Priority Inheritance Protocol, short PIP, is a scheduling - algorithm for a single-processor system.\footnote{We shall come back - later to the case of PIP on multi-processor systems.} - Following good experience in earlier work \cite{Wang09}, - our model of PIP is based on Paulson's inductive approach to protocol - verification \cite{Paulson98}. In this approach a \emph{state} of a system is - given by a list of events that happened so far (with new events prepended to the list). - \emph{Events} of PIP fall - into five categories defined as the datatype: - - \begin{isabelle}\ \ \ \ \ %%% - \mbox{\begin{tabular}{r@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {\hspace{7mm}}l} - \isacommand{datatype} event - & @{text "="} & @{term "Create thread priority"}\\ - & @{text "|"} & @{term "Exit thread"} \\ - & @{text "|"} & @{term "Set thread priority"} & {\rm reset of the priority for} @{text thread}\\ - & @{text "|"} & @{term "P thread cs"} & {\rm request of resource} @{text "cs"} {\rm by} @{text "thread"}\\ - & @{text "|"} & @{term "V thread cs"} & {\rm release of resource} @{text "cs"} {\rm by} @{text "thread"} - \end{tabular}} - \end{isabelle} - - \noindent - whereby threads, priorities and (critical) resources are represented - as natural numbers. The event @{term Set} models the situation that - a thread obtains a new priority given by the programmer or - user (for example via the {\tt nice} utility under UNIX). As in Paulson's work, we - need to define functions that allow us to make some observations - about states. One, called @{term threads}, calculates the set of - ``live'' threads that we have seen so far: - - \begin{isabelle}\ \ \ \ \ %%% - \mbox{\begin{tabular}{lcl} - @{thm (lhs) threads.simps(1)} & @{text "\"} & - @{thm (rhs) threads.simps(1)}\\ - @{thm (lhs) threads.simps(2)[where thread="th"]} & @{text "\"} & - @{thm (rhs) threads.simps(2)[where thread="th"]}\\ - @{thm (lhs) threads.simps(3)[where thread="th"]} & @{text "\"} & - @{thm (rhs) threads.simps(3)[where thread="th"]}\\ - @{term "threads (DUMMY#s)"} & @{text "\"} & @{term "threads s"}\\ - \end{tabular}} - \end{isabelle} - - \noindent - In this definition @{term "DUMMY # DUMMY"} stands for list-cons. - Another function calculates the priority for a thread @{text "th"}, which is - defined as - - \begin{isabelle}\ \ \ \ \ %%% - \mbox{\begin{tabular}{lcl} - @{thm (lhs) original_priority.simps(1)[where thread="th"]} & @{text "\"} & - @{thm (rhs) original_priority.simps(1)[where thread="th"]}\\ - @{thm (lhs) original_priority.simps(2)[where thread="th" and thread'="th'"]} & @{text "\"} & - @{thm (rhs) original_priority.simps(2)[where thread="th" and thread'="th'"]}\\ - @{thm (lhs) original_priority.simps(3)[where thread="th" and thread'="th'"]} & @{text "\"} & - @{thm (rhs) original_priority.simps(3)[where thread="th" and thread'="th'"]}\\ - @{term "original_priority th (DUMMY#s)"} & @{text "\"} & @{term "original_priority th s"}\\ - \end{tabular}} - \end{isabelle} - - \noindent - In this definition we set @{text 0} as the default priority for - threads that have not (yet) been created. The last function we need - calculates the ``time'', or index, at which time a process had its - priority last set. - - \begin{isabelle}\ \ \ \ \ %%% - \mbox{\begin{tabular}{lcl} - @{thm (lhs) birthtime.simps(1)[where thread="th"]} & @{text "\"} & - @{thm (rhs) birthtime.simps(1)[where thread="th"]}\\ - @{thm (lhs) birthtime.simps(2)[where thread="th" and thread'="th'"]} & @{text "\"} & - @{thm (rhs) birthtime.simps(2)[where thread="th" and thread'="th'"]}\\ - @{thm (lhs) birthtime.simps(3)[where thread="th" and thread'="th'"]} & @{text "\"} & - @{thm (rhs) birthtime.simps(3)[where thread="th" and thread'="th'"]}\\ - @{term "birthtime th (DUMMY#s)"} & @{text "\"} & @{term "birthtime th s"}\\ - \end{tabular}} - \end{isabelle} - - \noindent - In this definition @{term "length s"} stands for the length of the list - of events @{text s}. Again the default value in this function is @{text 0} - for threads that have not been created yet. A \emph{precedence} of a thread @{text th} in a - state @{text s} is the pair of natural numbers defined as - - \begin{isabelle}\ \ \ \ \ %%% - @{thm preced_def[where thread="th"]} - \end{isabelle} - - \noindent - The point of precedences is to schedule threads not according to priorities (because what should - we do in case two threads have the same priority), but according to precedences. - Precedences allow us to always discriminate between two threads with equal priority by - taking into account the time when the priority was last set. We order precedences so - that threads with the same priority get a higher precedence if their priority has been - set earlier, since for such threads it is more urgent to finish their work. In an implementation - this choice would translate to a quite natural FIFO-scheduling of processes with - the same priority. - - Next, we introduce the concept of \emph{waiting queues}. They are - lists of threads associated with every resource. The first thread in - this list (i.e.~the head, or short @{term hd}) is chosen to be the one - that is in possession of the - ``lock'' of the corresponding resource. We model waiting queues as - functions, below abbreviated as @{text wq}. They take a resource as - argument and return a list of threads. This allows us to define - when a thread \emph{holds}, respectively \emph{waits} for, a - resource @{text cs} given a waiting queue function @{text wq}. - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm cs_holding_def[where thread="th"]}\\ - @{thm cs_waiting_def[where thread="th"]} - \end{tabular} - \end{isabelle} - - \noindent - In this definition we assume @{text "set"} converts a list into a set. - At the beginning, that is in the state where no thread is created yet, - the waiting queue function will be the function that returns the - empty list for every resource. - - \begin{isabelle}\ \ \ \ \ %%% - @{abbrev all_unlocked}\hfill\numbered{allunlocked} - \end{isabelle} - - \noindent - Using @{term "holding"} and @{term waiting}, we can introduce \emph{Resource Allocation Graphs} - (RAG), which represent the dependencies between threads and resources. - We represent RAGs as relations using pairs of the form - - \begin{isabelle}\ \ \ \ \ %%% - @{term "(Th th, Cs cs)"} \hspace{5mm}{\rm and}\hspace{5mm} - @{term "(Cs cs, Th th)"} - \end{isabelle} - - \noindent - where the first stands for a \emph{waiting edge} and the second for a - \emph{holding edge} (@{term Cs} and @{term Th} are constructors of a - datatype for vertices). Given a waiting queue function, a RAG is defined - as the union of the sets of waiting and holding edges, namely - - \begin{isabelle}\ \ \ \ \ %%% - @{thm cs_depend_def} - \end{isabelle} - - \noindent - Given four threads and three resources, an instance of a RAG can be pictured - as follows: - - \begin{center} - \newcommand{\fnt}{\fontsize{7}{8}\selectfont} - \begin{tikzpicture}[scale=1] - %%\draw[step=2mm] (-3,2) grid (1,-1); - - \node (A) at (0,0) [draw, rounded corners=1mm, rectangle, very thick] {@{text "th\<^isub>0"}}; - \node (B) at (2,0) [draw, circle, very thick, inner sep=0.4mm] {@{text "cs\<^isub>1"}}; - \node (C) at (4,0.7) [draw, rounded corners=1mm, rectangle, very thick] {@{text "th\<^isub>1"}}; - \node (D) at (4,-0.7) [draw, rounded corners=1mm, rectangle, very thick] {@{text "th\<^isub>2"}}; - \node (E) at (6,-0.7) [draw, circle, very thick, inner sep=0.4mm] {@{text "cs\<^isub>2"}}; - \node (E1) at (6, 0.2) [draw, circle, very thick, inner sep=0.4mm] {@{text "cs\<^isub>3"}}; - \node (F) at (8,-0.7) [draw, rounded corners=1mm, rectangle, very thick] {@{text "th\<^isub>3"}}; - - \draw [<-,line width=0.6mm] (A) to node [pos=0.54,sloped,above=-0.5mm] {\fnt{}holding} (B); - \draw [->,line width=0.6mm] (C) to node [pos=0.4,sloped,above=-0.5mm] {\fnt{}waiting} (B); - \draw [->,line width=0.6mm] (D) to node [pos=0.4,sloped,below=-0.5mm] {\fnt{}waiting} (B); - \draw [<-,line width=0.6mm] (D) to node [pos=0.54,sloped,below=-0.5mm] {\fnt{}holding} (E); - \draw [<-,line width=0.6mm] (D) to node [pos=0.54,sloped,above=-0.5mm] {\fnt{}holding} (E1); - \draw [->,line width=0.6mm] (F) to node [pos=0.45,sloped,below=-0.5mm] {\fnt{}waiting} (E); - \end{tikzpicture} - \end{center} - - \noindent - The use of relations for representing RAGs allows us to conveniently define - the notion of the \emph{dependants} of a thread using the transitive closure - operation for relations. This gives - - \begin{isabelle}\ \ \ \ \ %%% - @{thm cs_dependents_def} - \end{isabelle} - - \noindent - This definition needs to account for all threads that wait for a thread to - release a resource. This means we need to include threads that transitively - wait for a resource being released (in the picture above this means the dependants - of @{text "th\<^isub>0"} are @{text "th\<^isub>1"} and @{text "th\<^isub>2"}, which wait for resource @{text "cs\<^isub>1"}, - but also @{text "th\<^isub>3"}, - which cannot make any progress unless @{text "th\<^isub>2"} makes progress, which - in turn needs to wait for @{text "th\<^isub>0"} to finish). If there is a circle of dependencies - in a RAG, then clearly - we have a deadlock. Therefore when a thread requests a resource, - we must ensure that the resulting RAG is not circular. In practice, the - programmer has to ensure this. - - - Next we introduce the notion of the \emph{current precedence} of a thread @{text th} in a - state @{text s}. It is defined as - - \begin{isabelle}\ \ \ \ \ %%% - @{thm cpreced_def2}\hfill\numbered{cpreced} - \end{isabelle} - - \noindent - where the dependants of @{text th} are given by the waiting queue function. - While the precedence @{term prec} of a thread is determined statically - (for example when the thread is - created), the point of the current precedence is to let the scheduler increase this - precedence, if needed according to PIP. Therefore the current precedence of @{text th} is - given as the maximum of the precedence @{text th} has in state @{text s} \emph{and} all - threads that are dependants of @{text th}. Since the notion @{term "dependants"} is - defined as the transitive closure of all dependent threads, we deal correctly with the - problem in the informal algorithm by Sha et al.~\cite{Sha90} where a priority of a thread is - lowered prematurely. - - The next function, called @{term schs}, defines the behaviour of the scheduler. It will be defined - by recursion on the state (a list of events); this function returns a \emph{schedule state}, which - we represent as a record consisting of two - functions: - - \begin{isabelle}\ \ \ \ \ %%% - @{text "\wq_fun, cprec_fun\"} - \end{isabelle} - - \noindent - The first function is a waiting queue function (that is, it takes a - resource @{text "cs"} and returns the corresponding list of threads - that lock, respectively wait for, it); the second is a function that - takes a thread and returns its current precedence (see - the definition in \eqref{cpreced}). We assume the usual getter and setter methods for - such records. - - In the initial state, the scheduler starts with all resources unlocked (the corresponding - function is defined in \eqref{allunlocked}) and the - current precedence of every thread is initialised with @{term "Prc 0 0"}; that means - \mbox{@{abbrev initial_cprec}}. Therefore - we have for the initial shedule state - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm (lhs) schs.simps(1)} @{text "\"}\\ - \hspace{5mm}@{term "(|wq_fun = all_unlocked, cprec_fun = (\_::thread. Prc 0 0)|)"} - \end{tabular} - \end{isabelle} - - \noindent - The cases for @{term Create}, @{term Exit} and @{term Set} are also straightforward: - we calculate the waiting queue function of the (previous) state @{text s}; - this waiting queue function @{text wq} is unchanged in the next schedule state---because - none of these events lock or release any resource; - for calculating the next @{term "cprec_fun"}, we use @{text wq} and - @{term cpreced}. This gives the following three clauses for @{term schs}: - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm (lhs) schs.simps(2)} @{text "\"}\\ - \hspace{5mm}@{text "let"} @{text "wq = wq_fun (schs s)"} @{text "in"}\\ - \hspace{8mm}@{term "(|wq_fun = wq\, cprec_fun = cpreced wq\ (Create th prio # s)|)"}\smallskip\\ - @{thm (lhs) schs.simps(3)} @{text "\"}\\ - \hspace{5mm}@{text "let"} @{text "wq = wq_fun (schs s)"} @{text "in"}\\ - \hspace{8mm}@{term "(|wq_fun = wq\, cprec_fun = cpreced wq\ (Exit th # s)|)"}\smallskip\\ - @{thm (lhs) schs.simps(4)} @{text "\"}\\ - \hspace{5mm}@{text "let"} @{text "wq = wq_fun (schs s)"} @{text "in"}\\ - \hspace{8mm}@{term "(|wq_fun = wq\, cprec_fun = cpreced wq\ (Set th prio # s)|)"} - \end{tabular} - \end{isabelle} - - \noindent - More interesting are the cases where a resource, say @{text cs}, is locked or released. In these cases - we need to calculate a new waiting queue function. For the event @{term "P th cs"}, we have to update - the function so that the new thread list for @{text cs} is the old thread list plus the thread @{text th} - appended to the end of that list (remember the head of this list is assigned to be in the possession of this - resource). This gives the clause - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm (lhs) schs.simps(5)} @{text "\"}\\ - \hspace{5mm}@{text "let"} @{text "wq = wq_fun (schs s)"} @{text "in"}\\ - \hspace{5mm}@{text "let"} @{text "new_wq = wq(cs := (wq cs @ [th]))"} @{text "in"}\\ - \hspace{8mm}@{term "(|wq_fun = new_wq, cprec_fun = cpreced new_wq (P th cs # s)|)"} - \end{tabular} - \end{isabelle} - - \noindent - The clause for event @{term "V th cs"} is similar, except that we need to update the waiting queue function - so that the thread that possessed the lock is deleted from the corresponding thread list. For this - list transformation, we use - the auxiliary function @{term release}. A simple version of @{term release} would - just delete this thread and return the remaining threads, namely - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}lcl} - @{term "release []"} & @{text "\"} & @{term "[]"}\\ - @{term "release (DUMMY # qs)"} & @{text "\"} & @{term "qs"}\\ - \end{tabular} - \end{isabelle} - - \noindent - In practice, however, often the thread with the highest precedence in the list will get the - lock next. We have implemented this choice, but later found out that the choice - of which thread is chosen next is actually irrelevant for the correctness of PIP. - Therefore we prove the stronger result where @{term release} is defined as - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}lcl} - @{term "release []"} & @{text "\"} & @{term "[]"}\\ - @{term "release (DUMMY # qs)"} & @{text "\"} & @{term "SOME qs'. distinct qs' \ set qs' = set qs"}\\ - \end{tabular} - \end{isabelle} - - \noindent - where @{text "SOME"} stands for Hilbert's epsilon and implements an arbitrary - choice for the next waiting list. It just has to be a list of distinctive threads and - contain the same elements as @{text "qs"}. This gives for @{term V} the clause: - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm (lhs) schs.simps(6)} @{text "\"}\\ - \hspace{5mm}@{text "let"} @{text "wq = wq_fun (schs s)"} @{text "in"}\\ - \hspace{5mm}@{text "let"} @{text "new_wq = release (wq cs)"} @{text "in"}\\ - \hspace{8mm}@{term "(|wq_fun = new_wq, cprec_fun = cpreced new_wq (V th cs # s)|)"} - \end{tabular} - \end{isabelle} - - Having the scheduler function @{term schs} at our disposal, we can ``lift'', or - overload, the notions - @{term waiting}, @{term holding}, @{term depend} and @{term cp} to operate on states only. - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}rcl} - @{thm (lhs) s_holding_abv} & @{text "\"} & @{thm (rhs) s_holding_abv}\\ - @{thm (lhs) s_waiting_abv} & @{text "\"} & @{thm (rhs) s_waiting_abv}\\ - @{thm (lhs) s_depend_abv} & @{text "\"} & @{thm (rhs) s_depend_abv}\\ - @{thm (lhs) cp_def} & @{text "\"} & @{thm (rhs) cp_def} - \end{tabular} - \end{isabelle} - - \noindent - With these abbreviations in place we can introduce - the notion of a thread being @{term ready} in a state (i.e.~threads - that do not wait for any resource) and the running thread. - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm readys_def}\\ - @{thm runing_def} - \end{tabular} - \end{isabelle} - - \noindent - In the second definition @{term "DUMMY ` DUMMY"} stands for the image of a set under a function. - Note that in the initial state, that is where the list of events is empty, the set - @{term threads} is empty and therefore there is neither a thread ready nor running. - If there is one or more threads ready, then there can only be \emph{one} thread - running, namely the one whose current precedence is equal to the maximum of all ready - threads. We use sets to capture both possibilities. - We can now also conveniently define the set of resources that are locked by a thread in a - given state and also when a thread is detached that state (meaning the thread neither - holds nor waits for a resource): - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm holdents_def}\\ - @{thm detached_def} - \end{tabular} - \end{isabelle} - - %\noindent - %The second definition states that @{text th} in @{text s}. - - Finally we can define what a \emph{valid state} is in our model of PIP. For - example we cannot expect to be able to exit a thread, if it was not - created yet. - These validity constraints on states are characterised by the - inductive predicate @{term "step"} and @{term vt}. We first give five inference rules - for @{term step} relating a state and an event that can happen next. - - \begin{center} - \begin{tabular}{c} - @{thm[mode=Rule] thread_create[where thread=th]}\hspace{1cm} - @{thm[mode=Rule] thread_exit[where thread=th]} - \end{tabular} - \end{center} - - \noindent - The first rule states that a thread can only be created, if it is not alive yet. - Similarly, the second rule states that a thread can only be terminated if it was - running and does not lock any resources anymore (this simplifies slightly our model; - in practice we would expect the operating system releases all locks held by a - thread that is about to exit). The event @{text Set} can happen - if the corresponding thread is running. - - \begin{center} - @{thm[mode=Rule] thread_set[where thread=th]} - \end{center} - - \noindent - If a thread wants to lock a resource, then the thread needs to be - running and also we have to make sure that the resource lock does - not lead to a cycle in the RAG. In practice, ensuring the latter - is the responsibility of the programmer. In our formal - model we brush aside these problematic cases in order to be able to make - some meaningful statements about PIP.\footnote{This situation is - similar to the infamous \emph{occurs check} in Prolog: In order to say - anything meaningful about unification, one needs to perform an occurs - check. But in practice the occurs check is omitted and the - responsibility for avoiding problems rests with the programmer.} - - - \begin{center} - @{thm[mode=Rule] thread_P[where thread=th]} - \end{center} - - \noindent - Similarly, if a thread wants to release a lock on a resource, then - it must be running and in the possession of that lock. This is - formally given by the last inference rule of @{term step}. - - \begin{center} - @{thm[mode=Rule] thread_V[where thread=th]} - \end{center} - - \noindent - A valid state of PIP can then be conveniently be defined as follows: - - \begin{center} - \begin{tabular}{c} - @{thm[mode=Axiom] vt_nil}\hspace{1cm} - @{thm[mode=Rule] vt_cons} - \end{tabular} - \end{center} - - \noindent - This completes our formal model of PIP. In the next section we present - properties that show our model of PIP is correct. -*} - -section {* The Correctness Proof *} - -(*<*) -context extend_highest_gen -begin -(*>*) -text {* - Sha et al.~state their first correctness criterion for PIP in terms - of the number of low-priority threads \cite[Theorem 3]{Sha90}: if - there are @{text n} low-priority threads, then a blocked job with - high priority can only be blocked a maximum of @{text n} times. - Their second correctness criterion is given - in terms of the number of critical resources \cite[Theorem 6]{Sha90}: if there are - @{text m} critical resources, then a blocked job with high priority - can only be blocked a maximum of @{text m} times. Both results on their own, strictly speaking, do - \emph{not} prevent indefinite, or unbounded, Priority Inversion, - because if a low-priority thread does not give up its critical - resource (the one the high-priority thread is waiting for), then the - high-priority thread can never run. The argument of Sha et al.~is - that \emph{if} threads release locked resources in a finite amount - of time, then indefinite Priority Inversion cannot occur---the high-priority - thread is guaranteed to run eventually. The assumption is that - programmers must ensure that threads are programmed in this way. However, even - taking this assumption into account, the correctness properties of - Sha et al.~are - \emph{not} true for their version of PIP---despite being ``proved''. As Yodaiken - \cite{Yodaiken02} pointed out: If a low-priority thread possesses - locks to two resources for which two high-priority threads are - waiting for, then lowering the priority prematurely after giving up - only one lock, can cause indefinite Priority Inversion for one of the - high-priority threads, invalidating their two bounds. - - Even when fixed, their proof idea does not seem to go through for - us, because of the way we have set up our formal model of PIP. One - reason is that we allow critical sections, which start with a @{text P}-event - and finish with a corresponding @{text V}-event, to arbitrarily overlap - (something Sha et al.~explicitly exclude). Therefore we have - designed a different correctness criterion for PIP. The idea behind - our criterion is as follows: for all states @{text s}, we know the - corresponding thread @{text th} with the highest precedence; we show - that in every future state (denoted by @{text "s' @ s"}) in which - @{text th} is still alive, either @{text th} is running or it is - blocked by a thread that was alive in the state @{text s} and was waiting - for or in the possession of a lock in @{text s}. Since in @{text s}, as in - every state, the set of alive threads is finite, @{text th} can only - be blocked a finite number of times. This is independent of how many - threads of lower priority are created in @{text "s'"}. We will actually prove a - stronger statement where we also provide the current precedence of - the blocking thread. However, this correctness criterion hinges upon - a number of assumptions about the states @{text s} and @{text "s' @ - s"}, the thread @{text th} and the events happening in @{text - s'}. We list them next: - - \begin{quote} - {\bf Assumptions on the states {\boldmath@{text s}} and - {\boldmath@{text "s' @ s"}:}} We need to require that @{text "s"} and - @{text "s' @ s"} are valid states: - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{l} - @{term "vt s"}, @{term "vt (s' @ s)"} - \end{tabular} - \end{isabelle} - \end{quote} - - \begin{quote} - {\bf Assumptions on the thread {\boldmath@{text "th"}:}} - The thread @{text th} must be alive in @{text s} and - has the highest precedence of all alive threads in @{text s}. Furthermore the - priority of @{text th} is @{text prio} (we need this in the next assumptions). - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{l} - @{term "th \ threads s"}\\ - @{term "prec th s = Max (cprec s ` threads s)"}\\ - @{term "prec th s = (prio, DUMMY)"} - \end{tabular} - \end{isabelle} - \end{quote} - - \begin{quote} - {\bf Assumptions on the events in {\boldmath@{text "s'"}:}} We want to prove that @{text th} cannot - be blocked indefinitely. Of course this can happen if threads with higher priority - than @{text th} are continuously created in @{text s'}. Therefore we have to assume that - events in @{text s'} can only create (respectively set) threads with equal or lower - priority than @{text prio} of @{text th}. We also need to assume that the - priority of @{text "th"} does not get reset and also that @{text th} does - not get ``exited'' in @{text "s'"}. This can be ensured by assuming the following three implications. - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{l} - {If}~~@{text "Create th' prio' \ set s'"}~~{then}~~@{text "prio' \ prio"}\\ - {If}~~@{text "Set th' prio' \ set s'"}~~{then}~~@{text "th' \ th"}~~{and}~~@{text "prio' \ prio"}\\ - {If}~~@{text "Exit th' \ set s'"}~~{then}~~@{text "th' \ th"}\\ - \end{tabular} - \end{isabelle} - \end{quote} - - \noindent - The locale mechanism of Isabelle helps us to manage conveniently such assumptions~\cite{Haftmann08}. - Under these assumptions we shall prove the following correctness property: - - \begin{theorem}\label{mainthm} - Given the assumptions about states @{text "s"} and @{text "s' @ s"}, - the thread @{text th} and the events in @{text "s'"}, - if @{term "th' \ running (s' @ s)"} and @{text "th' \ th"} then - @{text "th' \ threads s"}, @{text "\ detached s th'"} and - @{term "cp (s' @ s) th' = prec th s"}. - \end{theorem} - - \noindent - This theorem ensures that the thread @{text th}, which has the - highest precedence in the state @{text s}, can only be blocked in - the state @{text "s' @ s"} by a thread @{text th'} that already - existed in @{text s} and requested or had a lock on at least - one resource---that means the thread was not \emph{detached} in @{text s}. - As we shall see shortly, that means there are only finitely - many threads that can block @{text th} in this way and then they - need to run with the same current precedence as @{text th}. - - Like in the argument by Sha et al.~our - finite bound does not guarantee absence of indefinite Priority - Inversion. For this we further have to assume that every thread - gives up its resources after a finite amount of time. We found that - this assumption is awkward to formalise in our model. Therefore we - leave it out and let the programmer assume the responsibility to - program threads in such a benign manner (in addition to causing no - circularity in the @{text RAG}). In this detail, we do not - make any progress in comparison with the work by Sha et al. - However, we are able to combine their two separate bounds into a - single theorem improving their bound. - - In what follows we will describe properties of PIP that allow us to prove - Theorem~\ref{mainthm} and, when instructive, briefly describe our argument. - It is relatively easy to see that - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{text "running s \ ready s \ threads s"}\\ - @{thm[mode=IfThen] finite_threads} - \end{tabular} - \end{isabelle} - - \noindent - The second property is by induction of @{term vt}. The next three - properties are - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm[mode=IfThen] waiting_unique[of _ _ "cs\<^isub>1" "cs\<^isub>2"]}\\ - @{thm[mode=IfThen] held_unique[of _ "th\<^isub>1" _ "th\<^isub>2"]}\\ - @{thm[mode=IfThen] runing_unique[of _ "th\<^isub>1" "th\<^isub>2"]} - \end{tabular} - \end{isabelle} - - \noindent - The first property states that every waiting thread can only wait for a single - resource (because it gets suspended after requesting that resource); the second - that every resource can only be held by a single thread; - the third property establishes that in every given valid state, there is - at most one running thread. We can also show the following properties - about the @{term RAG} in @{text "s"}. - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{text If}~@{thm (prem 1) acyclic_depend}~@{text "then"}:\\ - \hspace{5mm}@{thm (concl) acyclic_depend}, - @{thm (concl) finite_depend} and - @{thm (concl) wf_dep_converse},\\ - \hspace{5mm}@{text "if"}~@{thm (prem 2) dm_depend_threads}~@{text "then"}~@{thm (concl) dm_depend_threads} - and\\ - \hspace{5mm}@{text "if"}~@{thm (prem 2) range_in}~@{text "then"}~@{thm (concl) range_in}. - \end{tabular} - \end{isabelle} - - \noindent - The acyclicity property follows from how we restricted the events in - @{text step}; similarly the finiteness and well-foundedness property. - The last two properties establish that every thread in a @{text "RAG"} - (either holding or waiting for a resource) is a live thread. - - The key lemma in our proof of Theorem~\ref{mainthm} is as follows: - - \begin{lemma}\label{mainlem} - Given the assumptions about states @{text "s"} and @{text "s' @ s"}, - the thread @{text th} and the events in @{text "s'"}, - if @{term "th' \ threads (s' @ s)"}, @{text "th' \ th"} and @{text "detached (s' @ s) th'"}\\ - then @{text "th' \ running (s' @ s)"}. - \end{lemma} - - \noindent - The point of this lemma is that a thread different from @{text th} (which has the highest - precedence in @{text s}) and not holding any resource, cannot be running - in the state @{text "s' @ s"}. - - \begin{proof} - Since thread @{text "th'"} does not hold any resource, no thread can depend on it. - Therefore its current precedence @{term "cp (s' @ s) th'"} equals its own precedence - @{term "prec th' (s' @ s)"}. Since @{text "th"} has the highest precedence in the - state @{text "(s' @ s)"} and precedences are distinct among threads, we have - @{term "prec th' (s' @s ) < prec th (s' @ s)"}. From this - we have @{term "cp (s' @ s) th' < prec th (s' @ s)"}. - Since @{text "prec th (s' @ s)"} is already the highest - @{term "cp (s' @ s) th"} can not be higher than this and can not be lower either (by - definition of @{term "cp"}). Consequently, we have @{term "prec th (s' @ s) = cp (s' @ s) th"}. - Finally we have @{term "cp (s' @ s) th' < cp (s' @ s) th"}. - By defintion of @{text "running"}, @{text "th'"} can not be running in state - @{text "s' @ s"}, as we had to show.\qed - \end{proof} - - \noindent - Since @{text "th'"} is not able to run in state @{text "s' @ s"}, it is not able to - issue a @{text "P"} or @{text "V"} event. Therefore if @{text "s' @ s"} is extended - one step further, @{text "th'"} still cannot hold any resource. The situation will - not change in further extensions as long as @{text "th"} holds the highest precedence. - - From this lemma we can deduce Theorem~\ref{mainthm}: that @{text th} can only be - blocked by a thread @{text th'} that - held some resource in state @{text s} (that is not @{text "detached"}). And furthermore - that the current precedence of @{text th'} in state @{text "(s' @ s)"} must be equal to the - precedence of @{text th} in @{text "s"}. - We show this theorem by induction on @{text "s'"} using Lemma~\ref{mainlem}. - This theorem gives a stricter bound on the threads that can block @{text th} than the - one obtained by Sha et al.~\cite{Sha90}: - only threads that were alive in state @{text s} and moreover held a resource. - This means our bound is in terms of both---alive threads in state @{text s} - and number of critical resources. Finally, the theorem establishes that the blocking threads have the - current precedence raised to the precedence of @{text th}. - - We can furthermore prove that under our assumptions no deadlock exists in the state @{text "s' @ s"} - by showing that @{text "running (s' @ s)"} is not empty. - - \begin{lemma} - Given the assumptions about states @{text "s"} and @{text "s' @ s"}, - the thread @{text th} and the events in @{text "s'"}, - @{term "running (s' @ s) \ {}"}. - \end{lemma} - - \begin{proof} - If @{text th} is blocked, then by following its dependants graph, we can always - reach a ready thread @{text th'}, and that thread must have inherited the - precedence of @{text th}.\qed - \end{proof} - - - %The following lemmas show how every node in RAG can be chased to ready threads: - %\begin{enumerate} - %\item Every node in RAG can be chased to a ready thread (@{text "chain_building"}): - % @ {thm [display] chain_building[rule_format]} - %\item The ready thread chased to is unique (@{text "dchain_unique"}): - % @ {thm [display] dchain_unique[of _ _ "th\<^isub>1" "th\<^isub>2"]} - %\end{enumerate} - - %Some deeper results about the system: - %\begin{enumerate} - %\item The maximum of @{term "cp"} and @{term "preced"} are equal (@{text "max_cp_eq"}): - %@ {thm [display] max_cp_eq} - %\item There must be one ready thread having the max @{term "cp"}-value - %(@{text "max_cp_readys_threads"}): - %@ {thm [display] max_cp_readys_threads} - %\end{enumerate} - - %The relationship between the count of @{text "P"} and @{text "V"} and the number of - %critical resources held by a thread is given as follows: - %\begin{enumerate} - %\item The @{term "V"}-operation decreases the number of critical resources - % one thread holds (@{text "cntCS_v_dec"}) - % @ {thm [display] cntCS_v_dec} - %\item The number of @{text "V"} never exceeds the number of @{text "P"} - % (@ {text "cnp_cnv_cncs"}): - % @ {thm [display] cnp_cnv_cncs} - %\item The number of @{text "V"} equals the number of @{text "P"} when - % the relevant thread is not living: - % (@{text "cnp_cnv_eq"}): - % @ {thm [display] cnp_cnv_eq} - %\item When a thread is not living, it does not hold any critical resource - % (@{text "not_thread_holdents"}): - % @ {thm [display] not_thread_holdents} - %\item When the number of @{text "P"} equals the number of @{text "V"}, the relevant - % thread does not hold any critical resource, therefore no thread can depend on it - % (@{text "count_eq_dependents"}): - % @ {thm [display] count_eq_dependents} - %\end{enumerate} - - %The reason that only threads which already held some resoures - %can be runing and block @{text "th"} is that if , otherwise, one thread - %does not hold any resource, it may never have its prioirty raised - %and will not get a chance to run. This fact is supported by - %lemma @{text "moment_blocked"}: - %@ {thm [display] moment_blocked} - %When instantiating @{text "i"} to @{text "0"}, the lemma means threads which did not hold any - %resource in state @{text "s"} will not have a change to run latter. Rephrased, it means - %any thread which is running after @{text "th"} became the highest must have already held - %some resource at state @{text "s"}. - - - %When instantiating @{text "i"} to a number larger than @{text "0"}, the lemma means - %if a thread releases all its resources at some moment in @{text "t"}, after that, - %it may never get a change to run. If every thread releases its resource in finite duration, - %then after a while, only thread @{text "th"} is left running. This shows how indefinite - %priority inversion can be avoided. - - %All these assumptions are put into a predicate @{term "extend_highest_gen"}. - %It can be proved that @{term "extend_highest_gen"} holds - %for any moment @{text "i"} in it @{term "t"} (@{text "red_moment"}): - %@ {thm [display] red_moment} - - %From this, an induction principle can be derived for @{text "t"}, so that - %properties already derived for @{term "t"} can be applied to any prefix - %of @{text "t"} in the proof of new properties - %about @{term "t"} (@{text "ind"}): - %\begin{center} - %@ {thm[display] ind} - %\end{center} - - %The following properties can be proved about @{term "th"} in @{term "t"}: - %\begin{enumerate} - %\item In @{term "t"}, thread @{term "th"} is kept live and its - % precedence is preserved as well - % (@{text "th_kept"}): - % @ {thm [display] th_kept} - %\item In @{term "t"}, thread @{term "th"}'s precedence is always the maximum among - % all living threads - % (@{text "max_preced"}): - % @ {thm [display] max_preced} - %\item In @{term "t"}, thread @{term "th"}'s current precedence is always the maximum precedence - % among all living threads - % (@{text "th_cp_max_preced"}): - % @ {thm [display] th_cp_max_preced} - %\item In @{term "t"}, thread @{term "th"}'s current precedence is always the maximum current - % precedence among all living threads - % (@{text "th_cp_max"}): - % @ {thm [display] th_cp_max} - %\item In @{term "t"}, thread @{term "th"}'s current precedence equals its precedence at moment - % @{term "s"} - % (@{text "th_cp_preced"}): - % @ {thm [display] th_cp_preced} - %\end{enumerate} - - %The main theorem of this part is to characterizing the running thread during @{term "t"} - %(@{text "runing_inversion_2"}): - %@ {thm [display] runing_inversion_2} - %According to this, if a thread is running, it is either @{term "th"} or was - %already live and held some resource - %at moment @{text "s"} (expressed by: @{text "cntV s th' < cntP s th'"}). - - %Since there are only finite many threads live and holding some resource at any moment, - %if every such thread can release all its resources in finite duration, then after finite - %duration, none of them may block @{term "th"} anymore. So, no priority inversion may happen - %then. - *} -(*<*) -end -(*>*) - -section {* Properties for an Implementation\label{implement} *} - -text {* - While our formalised proof gives us confidence about the correctness of our model of PIP, - we found that the formalisation can even help us with efficiently implementing it. - - For example Baker complained that calculating the current precedence - in PIP is quite ``heavy weight'' in Linux (see the Introduction). - In our model of PIP the current precedence of a thread in a state @{text s} - depends on all its dependants---a ``global'' transitive notion, - which is indeed heavy weight (see Def.~shown in \eqref{cpreced}). - We can however improve upon this. For this let us define the notion - of @{term children} of a thread @{text th} in a state @{text s} as - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm children_def2} - \end{tabular} - \end{isabelle} - - \noindent - where a child is a thread that is only one ``hop'' away from the thread - @{text th} in the @{term RAG} (and waiting for @{text th} to release - a resource). We can prove the following lemma. - - \begin{lemma}\label{childrenlem} - @{text "If"} @{thm (prem 1) cp_rec} @{text "then"} - \begin{center} - @{thm (concl) cp_rec}. - \end{center} - \end{lemma} - - \noindent - That means the current precedence of a thread @{text th} can be - computed locally by considering only the children of @{text th}. In - effect, it only needs to be recomputed for @{text th} when one of - its children changes its current precedence. Once the current - precedence is computed in this more efficient manner, the selection - of the thread with highest precedence from a set of ready threads is - a standard scheduling operation implemented in most operating - systems. - - Of course the main work for implementing PIP involves the - scheduler and coding how it should react to events. Below we - outline how our formalisation guides this implementation for each - kind of events.\smallskip -*} - -(*<*) -context step_create_cps -begin -(*>*) -text {* - \noindent - \colorbox{mygrey}{@{term "Create th prio"}:} We assume that the current state @{text s'} and - the next state @{term "s \ Create th prio#s'"} are both valid (meaning the event - is allowed to occur). In this situation we can show that - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm eq_dep},\\ - @{thm eq_cp_th}, and\\ - @{thm[mode=IfThen] eq_cp} - \end{tabular} - \end{isabelle} - - \noindent - This means in an implementation we do not have recalculate the @{text RAG} and also none of the - current precedences of the other threads. The current precedence of the created - thread @{text th} is just its precedence, namely the pair @{term "(prio, length (s::event list))"}. - \smallskip - *} -(*<*) -end -context step_exit_cps -begin -(*>*) -text {* - \noindent - \colorbox{mygrey}{@{term "Exit th"}:} We again assume that the current state @{text s'} and - the next state @{term "s \ Exit th#s'"} are both valid. We can show that - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm eq_dep}, and\\ - @{thm[mode=IfThen] eq_cp} - \end{tabular} - \end{isabelle} - - \noindent - This means again we do not have to recalculate the @{text RAG} and - also not the current precedences for the other threads. Since @{term th} is not - alive anymore in state @{term "s"}, there is no need to calculate its - current precedence. - \smallskip -*} -(*<*) -end -context step_set_cps -begin -(*>*) -text {* - \noindent - \colorbox{mygrey}{@{term "Set th prio"}:} We assume that @{text s'} and - @{term "s \ Set th prio#s'"} are both valid. We can show that - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm[mode=IfThen] eq_dep}, and\\ - @{thm[mode=IfThen] eq_cp_pre} - \end{tabular} - \end{isabelle} - - \noindent - The first property is again telling us we do not need to change the @{text RAG}. - The second shows that the @{term cp}-values of all threads other than @{text th} - are unchanged. The reason is that @{text th} is running; therefore it is not in - the @{term dependants} relation of any other thread. This in turn means that the - change of its priority cannot affect other threads. - - %The second - %however states that only threads that are \emph{not} dependants of @{text th} have their - %current precedence unchanged. For the others we have to recalculate the current - %precedence. To do this we can start from @{term "th"} - %and follow the @{term "depend"}-edges to recompute using Lemma~\ref{childrenlem} - %the @{term "cp"} of every - %thread encountered on the way. Since the @{term "depend"} - %is assumed to be loop free, this procedure will always stop. The following two lemmas show, however, - %that this procedure can actually stop often earlier without having to consider all - %dependants. - % - %\begin{isabelle}\ \ \ \ \ %%% - %\begin{tabular}{@ {}l} - %@{thm[mode=IfThen] eq_up_self}\\ - %@{text "If"} @{thm (prem 1) eq_up}, @{thm (prem 2) eq_up} and @{thm (prem 3) eq_up}\\ - %@{text "then"} @{thm (concl) eq_up}. - %\end{tabular} - %\end{isabelle} - % - %\noindent - %The first lemma states that if the current precedence of @{text th} is unchanged, - %then the procedure can stop immediately (all dependent threads have their @{term cp}-value unchanged). - %The second states that if an intermediate @{term cp}-value does not change, then - %the procedure can also stop, because none of its dependent threads will - %have their current precedence changed. - \smallskip - *} -(*<*) -end -context step_v_cps_nt -begin -(*>*) -text {* - \noindent - \colorbox{mygrey}{@{term "V th cs"}:} We assume that @{text s'} and - @{term "s \ V th cs#s'"} are both valid. We have to consider two - subcases: one where there is a thread to ``take over'' the released - resource @{text cs}, and one where there is not. Let us consider them - in turn. Suppose in state @{text s}, the thread @{text th'} takes over - resource @{text cs} from thread @{text th}. We can prove - - - \begin{isabelle}\ \ \ \ \ %%% - @{thm depend_s} - \end{isabelle} - - \noindent - which shows how the @{text RAG} needs to be changed. The next lemma suggests - how the current precedences need to be recalculated. For threads that are - not @{text "th"} and @{text "th'"} nothing needs to be changed, since we - can show - - \begin{isabelle}\ \ \ \ \ %%% - @{thm[mode=IfThen] cp_kept} - \end{isabelle} - - \noindent - For @{text th} and @{text th'} we need to use Lemma~\ref{childrenlem} to - recalculate their current precedence since their children have changed. *}(*<*)end context step_v_cps_nnt begin (*>*)text {* - \noindent - In the other case where there is no thread that takes over @{text cs}, we can show how - to recalculate the @{text RAG} and also show that no current precedence needs - to be recalculated. - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm depend_s}\\ - @{thm eq_cp} - \end{tabular} - \end{isabelle} - *} -(*<*) -end -context step_P_cps_e -begin -(*>*) -text {* - \noindent - \colorbox{mygrey}{@{term "P th cs"}:} We assume that @{text s'} and - @{term "s \ P th cs#s'"} are both valid. We again have to analyse two subcases, namely - the one where @{text cs} is not locked, and one where it is. We treat the former case - first by showing that - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm depend_s}\\ - @{thm eq_cp} - \end{tabular} - \end{isabelle} - - \noindent - This means we need to add a holding edge to the @{text RAG} and no - current precedence needs to be recalculated.*}(*<*)end context step_P_cps_ne begin(*>*) text {* - \noindent - In the second case we know that resource @{text cs} is locked. We can show that - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - @{thm depend_s}\\ - @{thm[mode=IfThen] eq_cp} - \end{tabular} - \end{isabelle} - - \noindent - That means we have to add a waiting edge to the @{text RAG}. Furthermore - the current precedence for all threads that are not dependants of @{text th} - are unchanged. For the others we need to follow the edges - in the @{text RAG} and recompute the @{term "cp"}. To do this we can start from @{term "th"} - and follow the @{term "depend"}-edges to recompute using Lemma~\ref{childrenlem} - the @{term "cp"} of every - thread encountered on the way. Since the @{term "depend"} - is loop free, this procedure will always stop. The following lemma shows, however, - that this procedure can actually stop often earlier without having to consider all - dependants. - - \begin{isabelle}\ \ \ \ \ %%% - \begin{tabular}{@ {}l} - %%@ {t hm[mode=IfThen] eq_up_self}\\ - @{text "If"} @{thm (prem 1) eq_up}, @{thm (prem 2) eq_up} and @{thm (prem 3) eq_up}\\ - @{text "then"} @{thm (concl) eq_up}. - \end{tabular} - \end{isabelle} - - \noindent - This lemma states that if an intermediate @{term cp}-value does not change, then - the procedure can also stop, because none of its dependent threads will - have their current precedence changed. - *} -(*<*) -end -(*>*) -text {* - \noindent - As can be seen, a pleasing byproduct of our formalisation is that the properties in - this section closely inform an implementation of PIP, namely whether the - @{text RAG} needs to be reconfigured or current precedences need to - be recalculated for an event. This information is provided by the lemmas we proved. - We confirmed that our observations translate into practice by implementing - our version of PIP on top of PINTOS, a small operating system written in C and used for teaching at - Stanford University \cite{PINTOS}. To implement PIP, we only need to modify the kernel - functions corresponding to the events in our formal model. The events translate to the following - function interface in PINTOS: - - \begin{center} - \begin{tabular}{|l@ {\hspace{2mm}}|l@ {\hspace{2mm}}|} - \hline - {\bf Event} & {\bf PINTOS function} \\ - \hline - @{text Create} & @{text "thread_create"}\\ - @{text Exit} & @{text "thread_exit"}\\ - @{text Set} & @{text "thread_set_priority"}\\ - @{text P} & @{text "lock_acquire"}\\ - @{text V} & @{text "lock_release"}\\ - \hline - \end{tabular} - \end{center} - - \noindent - Our implicit assumption that every event is an atomic operation is ensured by the architecture of - PINTOS. The case where an unlocked resource is given next to the waiting thread with the - highest precedence is realised in our implementation by priority queues. We implemented - them as \emph{Braun trees} \cite{Paulson96}, which provide efficient @{text "O(log n)"}-operations - for accessing and updating. Apart from having to implement relatively complex data\-structures in C - using pointers, our experience with the implementation has been very positive: our specification - and formalisation of PIP translates smoothly to an efficent implementation in PINTOS. -*} - -section {* Conclusion *} - -text {* - The Priority Inheritance Protocol (PIP) is a classic textbook - algorithm used in many real-time operating systems in order to avoid the problem of - Priority Inversion. Although classic and widely used, PIP does have - its faults: for example it does not prevent deadlocks in cases where threads - have circular lock dependencies. - - We had two goals in mind with our formalisation of PIP: One is to - make the notions in the correctness proof by Sha et al.~\cite{Sha90} - precise so that they can be processed by a theorem prover. The reason is - that a mechanically checked proof avoids the flaws that crept into their - informal reasoning. We achieved this goal: The correctness of PIP now - only hinges on the assumptions behind our formal model. The reasoning, which is - sometimes quite intricate and tedious, has been checked by Isabelle/HOL. - We can also confirm that Paulson's - inductive method for protocol verification~\cite{Paulson98} is quite - suitable for our formal model and proof. The traditional application - area of this method is security protocols. - - The second goal of our formalisation is to provide a specification for actually - implementing PIP. Textbooks, for example \cite[Section 5.6.5]{Vahalia96}, - explain how to use various implementations of PIP and abstractly - discuss their properties, but surprisingly lack most details important for a - programmer who wants to implement PIP (similarly Sha et al.~\cite{Sha90}). - That this is an issue in practice is illustrated by the - email from Baker we cited in the Introduction. We achieved also this - goal: The formalisation allowed us to efficently implement our version - of PIP on top of PINTOS \cite{PINTOS}, a simple instructional operating system for the x86 - architecture. It also gives the first author enough data to enable - his undergraduate students to implement PIP (as part of their OS course). - A byproduct of our formalisation effort is that nearly all - design choices for the PIP scheduler are backed up with a proved - lemma. We were also able to establish the property that the choice of - the next thread which takes over a lock is irrelevant for the correctness - of PIP. - - PIP is a scheduling algorithm for single-processor systems. We are - now living in a multi-processor world. Priority Inversion certainly - occurs also there. However, there is very little ``foundational'' - work about PIP-algorithms on multi-processor systems. We are not - aware of any correctness proofs, not even informal ones. There is an - implementation of a PIP-algorithm for multi-processors as part of the - ``real-time'' effort in Linux, including an informal description of the implemented scheduling - algorithm given in \cite{LINUX}. We estimate that the formal - verification of this algorithm, involving more fine-grained events, - is a magnitude harder than the one we presented here, but still - within reach of current theorem proving technology. We leave this - for future work. - - The most closely related work to ours is the formal verification in - PVS of the Priority Ceiling Protocol done by Dutertre - \cite{dutertre99b}---another solution to the Priority Inversion - problem, which however needs static analysis of programs in order to - avoid it. There have been earlier formal investigations - into PIP \cite{Faria08,Jahier09,Wellings07}, but they employ model - checking techniques. The results obtained by them apply, - however, only to systems with a fixed size, such as a fixed number of - events and threads. In contrast, our result applies to systems of arbitrary - size. Moreover, our result is a good - witness for one of the major reasons to be interested in machine checked - reasoning: gaining deeper understanding of the subject matter. - - Our formalisation - consists of around 210 lemmas and overall 6950 lines of readable Isabelle/Isar - code with a few apply-scripts interspersed. The formal model of PIP - is 385 lines long; the formal correctness proof 3800 lines. Some auxiliary - definitions and proofs span over 770 lines of code. The properties relevant - for an implementation require 2000 lines. - %The code of our formalisation - %can be downloaded from - %\url{http://www.inf.kcl.ac.uk/staff/urbanc/pip.html}.\medskip - - \noindent - {\bf Acknowledgements:} - We are grateful for the comments we received from anonymous - referees. - - \bibliographystyle{plain} - \bibliography{root} -*} - - -(*<*) -end -(*>*) \ No newline at end of file