11 section {* Introduction *} |
11 section {* Introduction *} |
12 |
12 |
13 text {* |
13 text {* |
14 Many real-time systems need to support processes with priorities and |
14 Many real-time systems need to support processes with priorities and |
15 locking of resources. Locking of resources ensures mutual exclusion |
15 locking of resources. Locking of resources ensures mutual exclusion |
16 when accessing shared data or devices. Priorities allow scheduling |
16 when accessing shared data or devices that cannot be |
17 of processes that need to finish their work within deadlines. |
17 preempted. Priorities allow scheduling of processes that need to |
18 Unfortunately, both features can interact in subtle ways leading to |
18 finish their work within deadlines. Unfortunately, both features |
19 a problem, called \emph{Priority Inversion}. Suppose three processes |
19 can interact in subtle ways leading to a problem, called |
20 having priorities $H$(igh), $M$(edium) and $L$(ow). We would expect |
20 \emph{Priority Inversion}. Suppose three processes having priorities |
21 that the process $H$ blocks any other process with lower priority |
21 $H$(igh), $M$(edium) and $L$(ow). We would expect that the process |
22 and itself cannot be blocked by a process with lower priority. Alas, |
22 $H$ blocks any other process with lower priority and itself cannot |
23 in a naive implementation of resource looking and priorities this |
23 be blocked by any process with lower priority. Alas, in a naive |
24 property can be violated. Even worse, $H$ can be delayed |
24 implementation of resource looking and priorities this property can |
25 indefinitely by processes with lower priorities. For this let $L$ be |
25 be violated. Even worse, $H$ can be delayed indefinitely by |
26 in the possession of a lock for a resource that also $H$ needs. $H$ |
26 processes with lower priorities. For this let $L$ be in the |
27 must therefore wait for $L$ to release this lock. The problem is |
27 possession of a lock for a resource that also $H$ needs. $H$ must |
28 that $L$ might in turn be blocked by any process with priority $M$, |
28 therefore wait for $L$ to exit the critical section and release this |
29 and so $H$ sits there potentially waiting indefinitely. Since $H$ is |
29 lock. The problem is that $L$ might in turn be blocked by any |
30 blocked by processes with lower priorities, the problem is called |
30 process with priority $M$, and so $H$ sits there potentially waiting |
31 Priority Inversion. It was first described in |
31 indefinitely. Since $H$ is blocked by processes with lower |
32 \cite{Lampson:Redell:cacm:1980} in the context of the Mesa |
32 priorities, the problem is called Priority Inversion. It was first |
33 programming language designed for concurrent programming. |
33 described in \cite{Lampson:Redell:cacm:1980} in the context of the |
|
34 Mesa programming language designed for concurrent programming. |
34 |
35 |
35 If the problem of Priority Inversion is ignored, real-time systems |
36 If the problem of Priority Inversion is ignored, real-time systems |
36 can become unpredictable and resulting bugs can be hard to diagnose. |
37 can become unpredictable and resulting bugs can be hard to diagnose. |
37 The classic example where this happened is the software that |
38 The classic example where this happened is the software that |
38 controlled the Mars Pathfinder mission in 1997 |
39 controlled the Mars Pathfinder mission in 1997 |
39 \cite{Reeves-Glenn-1998}. Once the spacecraft landed, the software |
40 \cite{Reeves-Glenn-1998}. Once the spacecraft landed, the software |
40 shut down at irregular intervals leading to loss of project time, as |
41 shut down at irregular intervals leading to loss of project time as |
41 normal operation of the craft could only resume the next day (the |
42 normal operation of the craft could only resume the next day (the |
42 mission and data already collected were fortunately not lost, because |
43 mission and data already collected were fortunately not lost, because |
43 of a clever system design). The reason for the shutdowns was that |
44 of a clever system design). The reason for the shutdowns was that |
44 the scheduling software fell victim of Priority Inversion: a low |
45 the scheduling software fell victim of Priority Inversion: a low |
45 priority task locking a resource prevented a high priority process |
46 priority task locking a resource prevented a high priority process |
46 from running in time leading to a system reset. Once the problem was found, |
47 from running in time leading to a system reset. Once the problem was found, |
47 it was rectified by enabling the Priority Inheritance Protocol in |
48 it was rectified by enabling the \emph{Priority Inheritance Protocol} |
48 the scheduling software. |
49 (PIP) \cite{journals/tc/ShaRL90} in the scheduling software. |
49 |
50 |
50 The idea behind the \emph{Priority Inheritance Protocol} (PIP) |
51 The idea behind PIP is to let the process $L$ temporarily |
51 \cite{journals/tc/ShaRL90} is to let the process $L$ temporarily |
52 inherit the high priority from $H$ until $L$ leaves the critical |
52 inherit the high priority from $H$ until $L$ releases the locked |
53 section unlocking the resource. This solves the problem of $H$ |
53 resource. This solves the problem of $H$ having to wait |
54 having to wait indefinitely, because $L$ cannot, for example, be |
54 indefinitely, because $L$ cannot, for example, be blocked by |
55 blocked by processes having priority $M$. While a few other |
55 processes having priority $M$. While a few other solutions exist for the |
56 solutions exist for the Priority Inversion problem |
56 Priority Inversion problem \cite{Lampson:Redell:cacm:1980}, |
57 \cite{Lampson:Redell:cacm:1980}, PIP is one that is widely deployed |
57 PIP is one that is widely deployed and implemented, including in |
58 and implemented. This includes VxWorks (a proprietary real-time OS |
58 VxWorks (a proprietary real-time OS used in the Mars Pathfinder |
59 used in the Mars Pathfinder mission, in Boeing's 787 Dreamliner, |
59 mission, in Boeing's 787 Dreamliner, Honda's ASIMO robot, etc.), but |
60 Honda's ASIMO robot, etc.), but also the POSIX 1003.1c Standard |
60 also as POSIX 1003.1c Standard, realised for example in libraries |
61 realised for example in libraries for FreeBSD, Solaris and Linux. |
61 for FreeBSD, Solaris and Linux. One advantage of PIP is that |
62 |
62 increasing the priority of a process can be dynamically |
63 One advantage of PIP is that increasing the priority of a process |
63 calculated. This is in contrast to, for example, \emph{Priority Ceiling}---another |
64 can be dynamically calculated by the scheduler. This is in contrast |
64 solution to the Priority Inversion problem, which however |
65 to, for example, \emph{Priority Ceiling}, another solution to the |
65 requires static analysis of the program in order to be helpful. |
66 Priority Inversion problem, which requires static analysis of the |
66 |
67 program in order to be helpful. However, there has also been strong |
67 However there has also been strong criticism against using PIP. For |
68 criticism against PIP. For instance, PIP cannot prevent deadlocks, |
68 example it cannot prevent deathlocks and also blocking times can be |
69 and also blocking times can be substantial (more than just the |
69 substantial (more than just the duration of a critical section). |
70 duration of a critical section). Though, most criticism against PIP |
70 However, most criticism of PIP centres around unreliabale |
71 centres around unreliable implementations and around PIP being |
71 implementations of PIP and PIP being complex. For example, Y...writes: |
72 too complicated and too inefficient. For example, Yodaiken writes in \cite{Yodaiken02}: |
72 |
73 |
73 \begin{quote} |
74 \begin{quote} |
74 \it{}``Priority inheritance is neither efficient nor reliable. Implementations |
75 \it{}``Priority inheritance is neither efficient nor reliable. Implementations |
75 are either incomplete (and unreliable) or surprisingly complex and intrusive.'' |
76 are either incomplete (and unreliable) or surprisingly complex and intrusive.'' |
76 \end{quote} |
77 \end{quote} |
77 |
78 |
78 \noindent |
79 \noindent |
79 His solution is to avoid PIP altogether by not allowing critical sections to be |
80 He suggests to avoid PIP altogether by not allowing critical |
80 pre-empted. While this might have been a sensible solution in 198..., in our |
81 sections to be preempted. While this was a sensible solution in |
81 modern multiprocessor world, this seems out of date. |
82 2002, in our modern multiprocessor world, this seems out of date. |
82 While there exists |
83 However, there is clearly a need for investigating correct and |
83 That there is a practical need |
84 efficient algorithms for PIP. A few specifications for PIP exist (in |
84 |
85 English) and also a few high-level descriptions of implementations |
85 Baker wrote on 13 July 2009 in the Linux Kernel mailing list: |
86 (e.g.~in the textbook \cite[Section 5.6.5]{Vahalia96}), but they help little with |
|
87 actual implementations. That this is a problem in practise is proved |
|
88 by an email from Baker, who wrote on 13 July 2009 on the Linux |
|
89 Kernel mailing list: |
86 |
90 |
87 \begin{quote} |
91 \begin{quote} |
88 \it{}``I observed in the kernel code |
92 \it{}``I observed in the kernel code (to my disgust), the Linux PIP |
89 (to my disgust), the Linux PIP implementation is a nightmare: |
93 implementation is a nightmare: extremely heavy weight, involving |
90 extremely heavy weight, involving maintenance of a full wait-for |
94 maintenance of a full wait-for graph, and requiring updates for a |
91 graph, and requiring updates for a range of events, including |
95 range of events, including priority changes and interruptions of |
92 priority changes and interruptions of wait operations.'' |
96 wait operations.'' |
93 \end{quote} |
97 \end{quote} |
94 |
98 |
95 \noindent |
99 \noindent |
96 This however means it is useful to look at PIP again from a more |
100 This however means it is useful to look at PIP again from a more |
97 abstract level (but still concrete enough to inform an efficient implementation) |
101 abstract level (but still concrete enough to inform an |
98 and makes it an ideal candidate for a formal verification. One reason |
102 implementation) and makes PIP an ideal candidate for a formal |
99 is of course that the original presentation of PIP, including a correcness ``proof'', |
103 verification. One reason, of course, is that the original |
100 is flawed. Y... points out a subletly that has been overlooked by Sha et al. |
104 presentation of PIP, despite being informally ``proved'' correct, is |
101 But this is too simplistic. Consider |
105 flawed. Yodaiken \cite{Yodaiken02} points to a subtlety that has |
102 |
106 been overlooked by Sha et al. |
103 |
107 |
104 |
108 But this is too |
|
109 simplistic. Consider |
105 Priority Inversion problem has been known since 1980 |
110 Priority Inversion problem has been known since 1980 |
106 \cite{Lampson:Redell:cacm:1980}, but Sha et al.~give the first |
111 \cite{Lampson:Redell:cacm:1980}, but Sha et al.~give the first |
107 thorough analysis and present an informal correctness proof for PIP |
112 thorough analysis and present an informal correctness proof for PIP |
108 .\footnote{Sha et al.~call it the |
113 .\footnote{Sha et al.~call it the |
109 \emph{Basic Priority Inheritance Protocol}.} |
114 \emph{Basic Priority Inheritance Protocol}.} |