<html>
  <head>
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <meta http-equiv="Content-Language" content="en" />
    <title>s6: service startup notifications</title>
    <meta name="Description" content="s6: service startup notifications" />
    <meta name="Keywords" content="s6 ftrig notification notifier writer libftrigw ftrigw startup U up svwait s6-svwait" />
    <!-- <link rel="stylesheet" type="text/css" href="//skarnet.org/default.css" /> -->
  </head>
<body>

<p>
<a href="index.html">s6</a><br />
<a href="//skarnet.org/software/">Software</a><br />
<a href="//skarnet.org/">skarnet.org</a>
</p>

<h1> Service startup notifications </h1>

<p>
 It is easy for a process supervision suite to know when a service that was <em>up</em>
is now <em>down</em>: the long-lived process implementing the service is dead. The
supervisor, running as the daemon's parent, is instantly notified via a SIGCHLD.
When that happens, <a href="s6-supervise.html">s6-supervise</a> sends a 'd' event
to its <tt>./event</tt> <a href="fifodir.html">fifodir</a>, so every subscriber
knows that the service is down. All is well.
</p>

<p>
 It is much trickier for a process supervision suite to know when a service
that was <em>down</em> is now <em>up</em>. The supervisor forks and execs the
daemon, and knows when the exec has succeeded; but after that point, it's all
up to the daemon itself. Some daemons do a lot of initialization work before
they're actually ready to serve, and it is impossible for the supervisor to
know exactly <em>when</em> the service is really ready.
<a href="s6-supervise.html">s6-supervise</a> sends a 'u' event to its
<tt>./event</tt> <a href="fifodir.html">fifodir</a> when it successfully
spawns the daemon, but any subscriber
reacting to 'u' is subject to a race condition - the service provided by the
daemon may not be ready yet.
</p>

<p>
 Reliable startup notifications need support from the daemons themselves.
Daemons should do two things to signal the outside world that they are
ready:
</p>

<ol>
 <li> Update a state file, so other processes can get a snapshot
of the daemon's state. </li>
 <li> Send an event to processes waiting for a state change. </li>
</ol>

<p>
 This is complex to implement in every single daemon, so s6 provides
tools to make it easier for daemon authors, without any need to link
against the s6 library or use any s6-specific construct:
 daemons can simply write a line to a file descriptor of their choice,
then close that file descriptor, when they're ready to serve. This is
a generic mechanism that some daemons already implement.
</p>

<p>
 s6 supports that mechanism natively: when the
<a href="servicedir.html">service directory</a> for the daemon contains
a valid <tt>notification-fd</tt> file, the daemon's supervisor, i.e. the
<a href="s6-supervise.html">s6-supervise</a> program, will properly catch
the daemon's message, update the status file (<tt>supervise/status</tt>),
then notify all the subscribers
with a <tt>'U'</tt> event, meaning that the service is now up and ready.
</p>
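
<p>
 To illustrate the reading side of this mechanism, here is a minimal sketch
in C. It is <em>not</em> the actual
<a href="s6-supervise.html">s6-supervise</a> code: the function name
<tt>wait_for_readiness</tt> is hypothetical, and it assumes <tt>fd</tt> is
the reading end of a pipe whose writing end was handed to the daemon. It
reads until it sees a newline and ignores everything else on the line.
</p>

<pre>
#include &lt;errno.h&gt;
#include &lt;unistd.h&gt;

/* Hypothetical helper, for illustration only.
   Returns 1 if a newline was read from fd (the writer announced readiness),
   0 if the writer closed its end without sending a newline,
   -1 on a read error. */
int wait_for_readiness (int fd)
{
  char buf[64];
  for (;;)
  {
    ssize_t r = read(fd, buf, sizeof buf);
    if (r &lt; 0)
    {
      if (errno == EINTR) continue;   /* interrupted by a signal: retry */
      return -1;
    }
    if (r == 0) return 0;             /* EOF before any newline */
    for (ssize_t i = 0; i &lt; r; i++)
      if (buf[i] == '\n') return 1;   /* any data before the newline is ignored */
  }
}
</pre>

<p>
 In this sketch, a return value of 1 corresponds to the moment a supervisor
would record readiness and send the <tt>'U'</tt> event; a return value of 0
means the writer closed the descriptor without ever announcing readiness.
</p>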

<p>
 This method should really be implemented in every long-running
program providing a service. When it is not the case, it's impossible
to provide reliable startup notifications, and subscribers should then
be content with the unreliable <tt>'u'</tt> events provided by s6-supervise.
</p>

<p>
 Unfortunately, a lot of long-running programs do not offer that
functionality; instead, they provide a way to poll them, an external
program that runs and checks whether the service is ready. This is a
<a href="//skarnet.org/software/s6/ftrig.html">bad</a> mechanism, for
<a href="//skarnet.org/cgi-bin/archive.cgi?2:mss:1607:dfblejammjllfkggpcph">several</a>
reasons. Nevertheless, until all daemons are patched to notify their
own readiness, s6 provides a way to run such a check program to poll
for readiness, and route its result into the s6 notification system:
<a href="s6-notifyoncheck.html">s6-notifyoncheck</a>.
</p>

<h2> How to use a check program with s6 (i.e. readiness checking via polling) </h2>

<ul>
 <li> Let's say you have a daemon <em>foo</em>, started under s6 via a
<tt>/run/service/foo</tt> service directory, and that comes with a
<tt>foo-check</tt> program that exhibits different behaviours when
<em>foo</em> is ready and when it is not. </li>
 <li> Create an executable script <tt>/run/service/foo/data/check</tt>
that calls <tt>foo-check</tt>. Make sure this script exits 0 when
<em>foo</em> is ready and nonzero when it's not. </li>
 <li> In your <tt>/run/service/foo/run</tt> script that starts <em>foo</em>,
instead of executing into <tt>foo</tt>, execute into
<tt>s6-notifyoncheck foo</tt>. Read the
<a href="s6-notifyoncheck.html">s6-notifyoncheck</a> page if you need to
give it options to tune the polling. </li>
 <li> Run <tt>echo 3 &gt; /run/service/foo/notification-fd</tt>. If file descriptor
3 is already open when your run script executes <em>foo</em>, replace 3 with
a file descriptor you <em>know</em> is not already open. </li>
 <li> That's it.
  <ul>
   <li> Your check script will be automatically invoked by
<a href="s6-notifyoncheck.html">s6-notifyoncheck</a>, until it succeeds. </li>
   <li> <a href="s6-notifyoncheck.html">s6-notifyoncheck</a> will send the
readiness notification to the file descriptor given in the <tt>notification-fd</tt>
file. </li>
   <li> <a href="s6-supervise.html">s6-supervise</a> will receive it and will
mark <em>foo</em> as ready. </li>
  </ul> </li>
</ul>
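
<p>
 What a check program looks like depends entirely on the daemon. Suppose,
purely as an assumption for illustration, that <em>foo</em> is ready as soon
as it accepts TCP connections on a loopback port. A hypothetical
<tt>foo-check</tt> could then be built around a function like the following
C sketch; the name <tt>foo_check_tcp</tt> and the idea that <em>foo</em>
listens on TCP are not part of s6 or of any real <em>foo</em>.
</p>

<pre>
#include &lt;arpa/inet.h&gt;
#include &lt;netinet/in.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/socket.h&gt;
#include &lt;unistd.h&gt;

/* Hypothetical readiness check: returns 0 if something accepts TCP
   connections on 127.0.0.1:port, 1 otherwise. The 0/nonzero convention
   matches what the check script is expected to exit with. */
int foo_check_tcp (unsigned short port)
{
  struct sockaddr_in addr;
  int ok;
  int s = socket(AF_INET, SOCK_STREAM, 0);
  if (s &lt; 0) return 1;

  memset(&amp;addr, 0, sizeof addr);
  addr.sin_family = AF_INET;
  addr.sin_port = htons(port);
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

  ok = connect(s, (struct sockaddr *)&amp;addr, sizeof addr) == 0;
  close(s);
  return ok ? 0 : 1;
}
</pre>

<p>
 A <tt>main</tt> function that returns the result of
<tt>foo_check_tcp</tt> would give a program with the exit codes the
<tt>data/check</tt> script needs.
</p>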

<h2> How to design a daemon so it uses the s6 mechanism <em>without</em> resorting to polling (i.e. readiness notification) </h2>

<p>
 The <a href="s6-notifyoncheck.html">s6-notifyoncheck</a> mechanism was
made to accommodate daemons that provide a check program but do not notify
readiness themselves; it works, but is suboptimal.
 If you are writing the <em>foo</em> daemon, here is how you can make things better:
</p>

<ul>
 <li> Readiness notification should be optional, so you should guard all
the following with a run-time option to <em>foo</em>. </li>
 <li> Assume a file descriptor other than 0, 1 or 2 is going to be open.
You can hardcode 3 (or 4); or you can make it configurable via a command line
option. See for instance the <tt>-D <em>notif</em></tt> option to the
<a href="//skarnet.org/software/mdevd/mdevd.html">mdevd</a> program. It
really doesn't matter what this number is; the important thing is that your
daemon knows that this fd is already open, and is not using it for another
purpose. </li>
 <li> Do nothing with this file descriptor until your daemon is ready. </li>
 <li> When your daemon is ready, write a newline to this file descriptor.
  <ul>
   <li> If you like, you may write other data before the newline, just in
case it is printed to the terminal. It is not necessary, and it is best to
keep that data short. If the line is read by
<a href="s6-supervise.html">s6-supervise</a>, it will be entirely ignored;
only the newline is important. </li>
  </ul> </li>
 <li> Then close that file descriptor. </li>
</ul>
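
<p>
 The last two steps above can be sketched in C as follows. This is a minimal
sketch under assumptions: the function name <tt>notify_readiness</tt> is
hypothetical, and <tt>fd</tt> is whatever descriptor number your daemon was
told to use (3 in the examples on this page).
</p>

<pre>
#include &lt;errno.h&gt;
#include &lt;unistd.h&gt;

/* Hypothetical helper: announce readiness on fd, then close it.
   Returns 0 on success, -1 on failure with errno set by write or close. */
int notify_readiness (int fd)
{
  ssize_t w;
  do w = write(fd, "\n", 1);            /* only the newline matters */
  while (w &lt; 0 &amp;&amp; errno == EINTR);      /* retry if interrupted by a signal */
  if (w &lt; 0)
  {
    int e = errno;                      /* keep the write error across close */
    close(fd);
    errno = e;
    return -1;
  }
  return close(fd);                     /* fd must not be used afterwards */
}
</pre>

<p>
 A daemon would call this once, after its initialization is complete, and
only when the run-time option enabling readiness notification was given.
</p>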

<p>
 The user who then makes <em>foo</em> run under s6 just has to do the
following:
</p>

<ul>
 <li> Write 3, or the file descriptor the <em>foo</em> daemon uses
to notify readiness, to the <tt>/run/service/foo/notification-fd</tt> file. </li>
 <li> In the <tt>/run/service/foo/run</tt> script, invoke <tt>foo</tt>
with the option that activates the readiness notification. If <em>foo</em>
makes the notification fd configurable, the user needs to make sure that
the number that is given to this option is the same as the number that is
written in the <tt>notification-fd</tt> file. </li>
 <li> And that is all. <strong>Do not</strong> use <tt>s6-notifyoncheck</tt>
in this case, because you do not need to poll to know whether <em>foo</em>
is ready; instead, <em>foo</em> will directly communicate its readiness to
<a href="s6-supervise.html">s6-supervise</a>, and that is a much more efficient
mechanism. </li>
</ul>

<h2> What does <a href="s6-supervise.html">s6-supervise</a> do with this
readiness information? </h2>

<ul>
 <li> <a href="s6-supervise.html">s6-supervise</a> maintains a readiness
state for other programs to read. You can check for it, for instance, via
the <a href="s6-svstat.html">s6-svstat</a> program. </li>
 <li> <a href="s6-supervise.html">s6-supervise</a> also broadcasts the
readiness event to programs that are waiting for it - for instance the
<a href="s6-svwait.html">s6-svwait</a> program. This can be used to
make sure that other programs only start when the daemon is ready. For
instance, the
<a href="//skarnet.org/software/s6-rc/">s6-rc</a> service manager uses
that mechanism to bring sets of services up or down: a service starts as
soon as all its dependencies are ready, but never earlier. </li>
</ul>

</body>
</html>