<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Language" content="en" />
<title>s6: service startup notifications</title>
<meta name="Description" content="s6: service startup notifications" />
<meta name="Keywords" content="s6 ftrig notification notifier writer libftrigw ftrigw startup U up svwait s6-svwait" />
<!-- <link rel="stylesheet" type="text/css" href="//skarnet.org/default.css" /> -->
</head>
<body>

<p>
<a href="index.html">s6</a><br />
<a href="//skarnet.org/software/">Software</a><br />
<a href="//skarnet.org/">skarnet.org</a>
</p>

<h1> Service startup notifications </h1>

<p>
It is easy for a process supervision suite to know when a service that was <em>up</em>
is now <em>down</em>: the long-lived process implementing the service is dead. The
supervisor, running as the daemon's parent, is instantly notified via a SIGCHLD.
When it happens, <a href="s6-supervise.html">s6-supervise</a> sends a 'd' event
to its <tt>./event</tt> <a href="fifodir.html">fifodir</a>, so every subscriber
knows that the service is down. All is well.
</p>

<p>
It is much trickier for a process supervision suite to know when a service
that was <em>down</em> is now <em>up</em>. The supervisor forks and execs the
daemon, and knows when the exec has succeeded; but after that point, it's all
up to the daemon itself. Some daemons do a lot of initialization work before
they're actually ready to serve, and it is impossible for the supervisor to
know exactly <em>when</em> the service is really ready.
<a href="s6-supervise.html">s6-supervise</a> sends a 'u' event to its
<tt>./event</tt> <a href="fifodir.html">fifodir</a> when it successfully
spawns the daemon, but any subscriber
reacting to 'u' is subject to a race condition - the service provided by the
daemon may not be ready yet.
</p>

<p>
Reliable startup notifications need support from the daemons themselves.
Daemons should do two things to signal the outside world that they are
ready:
</p>

<ol>
<li> Update a state file, so other processes can get a snapshot
of the daemon's state </li>
<li> Send an event to processes waiting for a state change. </li>
</ol>

<p>
This is complex to implement in every single daemon, so s6 provides
tools to make it easier for daemon authors, without any need to link
against the s6 library or use any s6-specific construct:
daemons can simply write a line to a file descriptor of their choice,
then close that file descriptor, when they're ready to serve. This is
a generic mechanism that some daemons already implement.
</p>

<p>
s6 supports that mechanism natively: when the
<a href="servicedir.html">service directory</a> for the daemon contains
a valid <tt>notification-fd</tt> file, the daemon's supervisor, i.e. the
<a href="s6-supervise.html">s6-supervise</a> program, will properly catch
the daemon's message, update the status file (<tt>supervise/status</tt>),
then notify all the subscribers
with a <tt>'U'</tt> event, meaning that the service is now up and ready.
</p>

<p>
This method should really be implemented in every long-running
program providing a service. When it is not the case, it's impossible
to provide reliable startup notifications, and subscribers should then
be content with the unreliable <tt>'u'</tt> events provided by s6-supervise.
</p>

<p>
Unfortunately, a lot of long-running programs do not offer that
functionality; instead, they provide a way to poll them, an external
program that runs and checks whether the service is ready. This is a
<a href="//skarnet.org/software/s6/ftrig.html">bad</a> mechanism, for
<a href="//skarnet.org/cgi-bin/archive.cgi?2:mss:1607:dfblejammjllfkggpcph">several</a>
reasons. Nevertheless, until all daemons are patched to notify their
own readiness, s6 provides a way to run such a check program to poll
for readiness, and route its result into the s6 notification system:
<a href="s6-notifyoncheck.html">s6-notifyoncheck</a>.
</p>

<h2> How to use a check program with s6 (i.e. readiness checking via polling) </h2>

<ul>
<li> Let's say you have a daemon <em>foo</em>, started under s6 via a
<tt>/run/service/foo</tt> service directory, and that comes with a
<tt>foo-check</tt> program that exhibits different behaviours when
<em>foo</em> is ready and when it is not. </li>
<li> Create an executable script <tt>/run/service/foo/data/check</tt>
that calls <tt>foo-check</tt>. Make sure this script exits 0 when
<em>foo</em> is ready and nonzero when it's not. </li>
<li> In your <tt>/run/service/foo/run</tt> script that starts <em>foo</em>,
instead of executing into <tt>foo</tt>, execute into
<tt>s6-notifyoncheck foo</tt>. Read the
<a href="s6-notifyoncheck.html">s6-notifyoncheck</a> page if you need to
give it options to tune the polling. </li>
<li> <tt>echo 3 > /run/service/foo/notification-fd</tt>. If file descriptor
3 is already open when your run script executes <em>foo</em>, replace 3 with
a file descriptor you <em>know</em> is not already open. </li>
<li> That's it.
<ul>
<li> Your check script will be automatically invoked by
<a href="s6-notifyoncheck.html">s6-notifyoncheck</a>, until it succeeds.
</li>
<li> <a href="s6-notifyoncheck.html">s6-notifyoncheck</a> will send the
readiness notification to the file descriptor given in the <tt>notification-fd</tt>
file. </li>
<li> <a href="s6-supervise.html">s6-supervise</a> will receive it and will
mark <em>foo</em> as ready. </li>
</ul> </li>
</ul>

<h2> How to design a daemon so it uses the s6 mechanism <em>without</em> resorting to polling (i.e. readiness notification) </h2>

<p>
The <a href="s6-notifyoncheck.html">s6-notifyoncheck</a> mechanism was
made to accommodate daemons that provide a check program but do not notify
readiness themselves; it works, but is suboptimal.
If you are writing the <em>foo</em> daemon, here is how you can make things better:
</p>

<ul>
<li> Readiness notification should be optional, so you should guard all
the following with a run-time option to <em>foo</em>. </li>
<li> Assume a file descriptor other than 0, 1 or 2 is going to be open.
You can hardcode 3 (or 4); or you can make it configurable via a command line
option. See for instance the <tt>-D <em>notif</em></tt> option to the
<a href="//skarnet.org/software/mdevd/mdevd.html">mdevd</a> program. It
really doesn't matter what this number is; the important thing is that your
daemon knows that this fd is already open, and is not using it for another
purpose. </li>
<li> Do nothing with this file descriptor until your daemon is ready. </li>
<li> When your daemon is ready, write a newline to this file descriptor.
<ul>
<li> If you like, you may write other data before the newline, just in
case it is printed to the terminal. It is not necessary, and it is best to
keep that data short. If the line is read by
<a href="s6-supervise.html">s6-supervise</a>, it will be entirely ignored;
only the newline is important. </li>
</ul> </li>
<li> Then close that file descriptor.
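<p>
The daemon-side steps in this list can be sketched in C as follows. This is
only a minimal illustration, not code taken from s6 or from any particular
daemon: the helper name <tt>notify_ready</tt> is hypothetical, and it assumes
that the descriptor it is given was already open when the daemon started and
is not used for anything else.
</p>

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not part of s6): announce readiness on notif_fd
 * by writing a single newline, then close the descriptor.
 * Assumes notif_fd was inherited open and is reserved for this purpose.
 * Returns 0 on success, -1 on failure with errno set by write(). */
static int notify_ready(int notif_fd)
{
    ssize_t r;
    int saved;

    /* Retry the write if a signal interrupts it. */
    do r = write(notif_fd, "\n", 1);
    while (r == -1 && errno == EINTR);

    /* Close in every case: the descriptor has no further use.
     * Preserve the errno value left by write() across close(). */
    saved = errno;
    close(notif_fd);
    errno = saved;

    return r == 1 ? 0 : -1;
}
```

<p>
A daemon would call such a helper exactly once, right after its
initialization is complete, and only when its readiness option was given on
the command line; for instance <tt>notify_ready(3)</tt> if it hardcodes
file descriptor 3.
</p>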
</li>
</ul>

<p>
The user who then makes <em>foo</em> run under s6 just has to do the
following:
</p>

<ul>
<li> Write 3, or the file descriptor the <em>foo</em> daemon uses
to notify readiness, to the <tt>/run/service/foo/notification-fd</tt> file. </li>
<li> In the <tt>/run/service/foo/run</tt> script, invoke <tt>foo</tt>
with the option that activates the readiness notification. If <em>foo</em>
makes the notification fd configurable, the user needs to make sure that
the number that is given to this option is the same as the number that is
written in the <tt>notification-fd</tt> file. </li>
<li> And that is all. <strong>Do not</strong> use <tt>s6-notifyoncheck</tt>
in this case, because you do not need to poll to know whether <em>foo</em>
is ready; instead, <em>foo</em> will directly communicate its readiness to
<a href="s6-supervise.html">s6-supervise</a>, and that is a much more efficient
mechanism. </li>
</ul>

<h2> What does <a href="s6-supervise.html">s6-supervise</a> do with this
readiness information? </h2>

<ul>
<li> <a href="s6-supervise.html">s6-supervise</a> maintains a readiness
state for other programs to read. You can check for it, for instance, via
the <a href="s6-svstat.html">s6-svstat</a> program. </li>
<li> <a href="s6-supervise.html">s6-supervise</a> also broadcasts the
readiness event to programs that are waiting for it - for instance the
<a href="s6-svwait.html">s6-svwait</a> program. This can be used to
make sure that other programs only start when the daemon is ready. For
instance, the
<a href="//skarnet.org/software/s6-rc/">s6-rc</a> service manager uses
that mechanism to bring sets of services up or down: a service starts as
soon as all its dependencies are ready, but never earlier. </li>
</ul>

</body>
</html>