s6

Mirror/fork of https://skarnet.org/software/s6/
git clone https://ccx.te2000.cz/git/s6

servicedir.html (18408B)


      1 <html>
      2   <head>
      3     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
      4     <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
      5     <meta http-equiv="Content-Language" content="en" />
      6     <title>s6: service directories</title>
      7     <meta name="Description" content="s6: service directory" />
      8     <meta name="Keywords" content="s6 supervision supervise service directory run finish servicedir" />
      9     <!-- <link rel="stylesheet" type="text/css" href="//skarnet.org/default.css" /> -->
     10   </head>
     11 <body>
     12 
     13 <p>
     14 <a href="index.html">s6</a><br />
     15 <a href="//skarnet.org/software/">Software</a><br />
     16 <a href="//skarnet.org/">skarnet.org</a>
     17 </p>
     18 
     19 <h1> Service directories </h1>
     20 
     21 <p>
     22  A <em>service directory</em> is a directory containing all the information
     23 related to a <em>service</em>, i.e. a long-running process maintained and
     24 supervised by <a href="s6-supervise.html">s6-supervise</a>.
     25 </p>
     26 
     27 <p>
     28  (Strictly speaking, a <em>service</em> is not always equivalent to a
     29 long-running process. Things like Ethernet interfaces fit the definition
     30 of <em>services</em> one may want to supervise; however, s6 does not
     31 provide <em>service supervision</em>; it provides <em>process supervision</em>,
     32 and it is impractical to use the s6 architecture as is to supervise
     33 services that are not equivalent to one long-running process. However,
     34 we still use the terms <em>service</em> and <em>service directory</em>
     35 for historical and compatibility reasons.)
     36 </p>
     37 
     38 <h2> Contents </h2>
     39 
     40 <p> A service directory <em>foo</em> may contain the following elements: </p>
     41 
     42 <ul>
     43  <li style="margin-bottom:1em"> An executable file named <tt>run</tt>. It can be any executable
     44 file (such as a binary file or a link to any other executable file),
     45 but most of the time it will be a script, called a <em>run script</em>.
     46 This file is the most important one in your service directory: it
     47 contains the commands that will set up and run your <em>foo</em> service.
     48  <ul>
     49   <li> It is forked and executed by <a href="s6-supervise.html">s6-supervise</a>
     50 every time the service must be started, i.e. normally when
     51 <a href="s6-supervise.html">s6-supervise</a> starts, and whenever
     52 the service goes down when it is supposed to be up. </li>
     53   <li> It is given one argument, which is the same argument that the
     54 <a href="s6-supervise.html">s6-supervise</a> process is running with,
     55 i.e. the name of the service directory &mdash; or, if
     56 <a href="s6-supervise.html">s6-supervise</a> is run under
     57 <a href="s6-svscan.html">s6-svscan</a>, the name of the service directory
     58 as seen by <a href="s6-svscan.html">s6-svscan</a> in its
     59 <a href="scandir.html">scan directory</a>. That is, <tt><em>foo</em></tt>
     60 or <tt><em>foo</em>/log</tt>, if <em>foo</em> is the name of the
     61 <em>symbolic link</em> in the scan directory. </li> </ul>
     62 
     63 <p> A run script should normally: </p>
     64  <ul>
     65   <li> adjust redirections for stdin, stdout and stderr. When a run
     66 script starts, it inherits its standard file descriptors from
     67 <a href="s6-supervise.html">s6-supervise</a>, which itself inherits them from
     68 <a href="s6-svscan.html">s6-svscan</a>. stdin is normally <tt>/dev/null</tt>.
     69 If s6-svscan was launched by another init system, stdout and stderr likely
     70 point to that init system's default log (or <tt>/dev/null</tt> in the case
     71 of sysvinit). If s6-svscan is running as pid 1 with the help of software like
     72 <a href="//skarnet.org/software/s6-linux-init/">s6-linux-init</a>, then its
     73 stdout and stderr point to a <em>catch-all logger</em>, which catches and
     74 logs any output of the supervision tree that has not been caught by a
     75 dedicated logger. If the defaults provided by your installation are not
     76 suitable for your run script, then your run script should perform the proper
     77 redirections before executing into the final daemon. For instance, dedicated
     78 logging mechanisms, such as the <tt>log</tt> subdirectory (see below) or the
     79 <a href="//skarnet.org/software/s6-rc/">s6-rc</a> pipeline feature, pipe your
     80 run script's <em>stdout</em> to the logging service, but chances are you want
     81 to log <em>stderr</em> as well, so the run script should make sure that its
     82 stderr goes into the log pipe. This
     83 is achieved by <tt><a href="//skarnet.org/software/execline/fdmove.html">fdmove</a>
     84 -c 2 1</tt> in <a href="//skarnet.org/software/execline/">execline</a>,
     85 and <tt>exec 2&gt;&amp;1</tt> in <a href="https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sh.html">shell</a>.
     86  </li>
     87 <li> adjust the environment for your <em>foo</em> daemon. Normally the run script
     88 inherits its environment from <a href="s6-supervise.html">s6-supervise</a>,
     89 which normally inherits its environment from <a href="s6-svscan.html">s6-svscan</a>,
     90 which normally inherits a minimal environment from the boot scripts.
     91 Service-specific environment variables should be set in the run script. </li>
     92  <li> adjust other parameters for the <em>foo</em> daemon, such as its
     93 uid and gid. Normally the supervision tree, i.e.
     94 <a href="s6-svscan.html">s6-svscan</a> and the various
     95 <a href="s6-supervise.html">s6-supervise</a> processes, is run as root, so
     96 run scripts are also run as root; however, for security purposes, services
     97 should not run as root if they don't need to. You can use the
     98 <a href="s6-setuidgid.html">s6-setuidgid</a> utility in <em>foo</em><tt>/run</tt>
     99 to lose privileges before executing into <em>foo</em>'s long-lived
    100 process; or the <a href="s6-envuidgid.html">s6-envuidgid</a> utility if
    101 your long-lived process needs root privileges at start time but can drop
    102 them afterwards. </li>
    103  <li> execute into the long-lived process that is to be supervised by
    104 <a href="s6-supervise.html">s6-supervise</a>, i.e. the real <em>foo</em>
    105 daemon. That process must not "background itself": being run by a supervision
    106 tree already makes it a "background" task. </li>
    107  </ul> </li>
    108 
    109  <li style="margin-bottom:1em"> An optional executable file named <tt>finish</tt>. Like <tt>run</tt>,
    110 it can be any executable file. This <em>finish script</em>, if present,
    111 is executed every time the <tt>run</tt> script dies. Generally, its main
    112 purpose is to clean up non-volatile data such as the filesystem after the supervised
    113 process has been killed. If the <em>foo</em> service is supposed to be up,
    114 <em>foo</em><tt>/run</tt> is restarted after <em>foo</em><tt>/finish</tt> dies.
    115  <ul>
    116   <li> By default, a finish script must do its work and exit in less than
    117 5 seconds; if it takes more than that, it is killed. (The point is that the run
    118 script, not the finish script, should be running; the finish script should really
    119 be short-lived.) The maximum duration of a <tt>finish</tt> execution can be
    120 configured via the <tt>timeout-finish</tt> file, see below. </li>
    121   <li> The finish script is executed with three arguments:
    122    <ol>
    123     <li> the exit code from the run script, or 256 if the run script was killed by a signal </li>
    124     <li> the number of the signal that killed the run script, or an undefined number if it was not killed by a signal </li>
    125     <li> the name of the service directory, the same one that is given to <tt>./run</tt>. </li>
    126    </ol> </li>
    127   <li> If the finish script exits 125, then <a href="s6-supervise.html">s6-supervise</a>
    128 interprets this as a permanent failure for the service, and does not restart it,
    129 as if an <a href="s6-svc.html">s6-svc -O</a> command had been sent. </li>
    130   <li> If <a href="s6-supervise.html">s6-supervise</a> has been instructed to exit after
    131 the service dies, via a <tt>s6-svc -x</tt> command or a SIGHUP, then the next
    132 invocation of <tt>finish</tt> will (obviously) be the last, and it will run with
    133 stdin and stdout pointing to <tt>/dev/null</tt>. </li>
    134  </ul> </li>
    135 
    136  <li style="margin-bottom:1em"> A directory named <tt>supervise</tt>. It is automatically created by
    137 <a href="s6-supervise.html">s6-supervise</a> if it does not exist. This is where
    138 <a href="s6-supervise.html">s6-supervise</a> stores its internal information.
    139 The directory must be writable. </li>
    140 
    141  <li style="margin-bottom:1em"> An optional, empty, regular file named <tt>down</tt>. If such a file exists,
    142 the default state of the service is considered down, not up: s6-supervise will not
    143 automatically start it until it receives a <tt>s6-svc -u</tt> command. If no
    144 <tt>down</tt> file exists, the default state of the service is up. </li>
    145 
    146  <li style="margin-bottom:1em"> An optional regular file named <tt>notification-fd</tt>. If such a file
    147 exists, it means that the service supports
    148 <a href="notifywhenup.html">readiness notification</a>. The file must only
    149  contain an unsigned integer, which is the number of the file descriptor that
    150 the service writes its readiness notification to. (For instance, it should
    151 be 1 if the daemon is <a href="s6-ipcserverd.html">s6-ipcserverd</a> run with the
    152 <tt>-1</tt> option.)
    153   When a service is started, or restarted, by s6-supervise, if this file
    154 exists and contains a valid descriptor number, s6-supervise will wait for the
    155 notification from the service and broadcast readiness, i.e. any
    156 <a href="s6-svwait.html">s6-svwait -U</a>,
    157 <a href="s6-svlisten1.html">s6-svlisten1 -U</a> or
    158 <a href="s6-svlisten.html">s6-svlisten -U</a> processes will be
    159 triggered. </li>
    160 
    161  <li style="margin-bottom:1em"> An optional regular file named <tt>lock-fd</tt>. If such a file
    162 exists, it must contain an unsigned integer, representing a file descriptor that
    163 will be open in the service. The service <em>should not write to that descriptor</em>
    164 and <em>should not close it</em>. In other words, it should totally ignore it. That
    165 file descriptor holds a lock that will naturally be released when the service dies.
    166 The point of this feature is to prevent s6-supervise from accidentally spawning several
    167 copies of the service in case something goes wrong: for instance, the service
    168 backgrounds itself (which it shouldn't do when running under a supervision suite), or
    169 s6-supervise is killed, restarted by s6-svscan, and attempts to start another copy of
    170 the service while the first copy is still alive. If s6-supervise detects that the lock
    171 is held when it tries to start the service, it will print a warning message; the new
    172 service instance will block until the lock is released, then proceed as usual. </li>
    173 
    174  <li style="margin-bottom:1em"> An optional regular file named <tt>timeout-kill</tt>. If such a file
    175 exists, it must only contain an unsigned integer <em>t</em>. If <em>t</em>
    176 is nonzero, then on receipt of an <a href="s6-svc.html">s6-svc -d</a> command,
    177 which sends a SIGTERM (by default, see <tt>down-signal</tt> below) and a
    178 SIGCONT to the service, a timeout of <em>t</em>
    179 milliseconds is set; and if the service is still not dead after <em>t</em>
    180 milliseconds, then it is sent a SIGKILL. If <tt>timeout-kill</tt> does not
    181 exist, or contains 0 or an invalid value, then the service is never
    182 forcibly killed (unless, of course, an <a href="s6-svc.html">s6-svc -k</a>
    183 command is sent). </li>
    184 
    185  <li style="margin-bottom:1em"> An optional regular file named <tt>timeout-finish</tt>. If such a file
    186 exists, it must only contain an unsigned integer, which is the number of
    187 milliseconds after which the <tt>./finish</tt> script, if it exists, will
    188 be killed with a SIGKILL. The default is 5000: finish scripts are killed
    189 if they're still alive after 5 seconds. A value of 0 allows finish scripts
    190 to run forever. </li>
    191 
    192  <li style="margin-bottom:1em"> An optional regular file named <tt>max-death-tally</tt>. If such a file
    193 exists, it must only contain an unsigned integer, which is the maximum number of
    194 service death events that s6-supervise will keep track of. If the service dies
    195 more than this number of times, the oldest events will be forgotten. Tracking
    196 death events is useful, for instance, when throttling service restarts. The
    197 value cannot be greater than 4096. If the file does not exist, a default of 100
    198 is used. </li>
    199 
    200  <li style="margin-bottom:1em"> An optional regular file named <tt>down-signal</tt>. If such a file
    201 exists, it must only contain the name or number of a signal, followed by a
    202 newline. This signal will be used to kill the supervised process when a
    203 <a href="s6-svc.html">s6-svc -d</a> or <a href="s6-svc.html">s6-svc -r</a>
    204 command is used. If the file does not exist, SIGTERM will be used by default. </li>
    205 
    206  <li style="margin-bottom:1em"> A <a href="fifodir.html">fifodir</a> named <tt>event</tt>. It is automatically
    207 created by <a href="s6-supervise.html">s6-supervise</a> if it does not exist.
    208 <em>foo</em><tt>/event</tt>
    209 is the rendez-vous point for listeners, where <a href="s6-supervise.html">s6-supervise</a>
    210 will send notifications when the service goes up or down. </li>
    211 
    212  <li style="margin-bottom:1em"> Optional directories named <tt>instance</tt>
    213 and <tt>instances</tt>. Those are internal subdirectories created by
    214 <a href="s6-instance-maker.html">s6-instance-maker</a> in a templated service
    215 directory. Outside of instanced services, these directories should never
    216 appear, and you should never create them manually. </li>
    217 
    218 
    219  <li style="margin-bottom:1em"> An optional service directory named <tt>log</tt>. If it exists and <em>foo</em>
    220 is in a <a href="scandir.html">scandir</a>, and <a href="s6-svscan.html">s6-svscan</a>
    221 runs on that scandir, then <em>two</em> services are monitored: <em>foo</em> and
    222 <em>foo</em><tt>/log</tt>. A pipe is open and maintained between <em>foo</em> and
    223 <em>foo</em><tt>/log</tt>, i.e. everything that <em>foo</em><tt>/run</tt>
    224 writes to its stdout will appear on <em>foo</em><tt>/log/run</tt>'s stdin. The <em>foo</em>
    225 service is said to be <em>logged</em>; the <em>foo</em><tt>/log</tt> service is called
    226 <em>foo</em>'s <em>logger</em>. A logger service cannot be logged: if
    227 <em>foo</em><tt>/log/log</tt> exists, nothing special happens. </li>
    228 </ul>
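
<p>
 As an illustration of the elements listed above, here is a minimal sketch
that builds a logged service directory in a scratch location. The
<tt>foo-daemon</tt> program, the <tt>foo</tt> and <tt>foolog</tt> users, the
<tt>/var/log/foo</tt> directory and the scratch path are all hypothetical
placeholders; the timeout values are arbitrary. It only writes files and
does not start any supervision.
</p>

<pre>
```shell
#!/bin/sh
# Sketch only: builds a hypothetical service directory "foo" under a scratch path.
set -e

svc="${TMPDIR:-/tmp}/s6-example/foo"   # hypothetical location
rm -rf "$svc"
mkdir -p "$svc/log"

# run script: send stderr to the log pipe, drop privileges, exec into the daemon.
# "foo-daemon" and the "foo" user are placeholders.
cat > "$svc/run" <<'EOF'
#!/bin/sh
exec 2>&1
exec s6-setuidgid foo foo-daemon
EOF

# finish script: $1 = exit code (256 if killed by a signal),
# $2 = signal number, $3 = service directory name.
cat > "$svc/finish" <<'EOF'
#!/bin/sh
if [ "$1" -eq 256 ]; then
  echo "$3: run script killed by signal $2"
else
  echo "$3: run script exited $1"
fi
EOF

# logger: reads foo's stdout on its stdin.
# "foolog" and /var/log/foo are placeholders.
cat > "$svc/log/run" <<'EOF'
#!/bin/sh
exec s6-setuidgid foolog s6-log t /var/log/foo
EOF

# Optional configuration files (values in milliseconds, chosen arbitrarily).
echo 3000  > "$svc/timeout-kill"     # SIGKILL 3 s after s6-svc -d
echo 10000 > "$svc/timeout-finish"   # give ./finish up to 10 s

chmod 755 "$svc/run" "$svc/finish" "$svc/log/run"
```
</pre>

<p>
 The <tt>supervise</tt> and <tt>event</tt> subdirectories are deliberately
not created here, since s6-supervise creates them itself when it starts.
</p>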
    229 
    230  <h3> Stability </h3>
    231 
    232 <p>
    233  As s6 evolves, it is possible that
    234  <a href="s6-supervise.html">s6-supervise</a> configuration will use more and more
    235 files in the service directory. The
    236 <tt>notification-fd</tt> and <tt>timeout-finish</tt> files, for
    237 instance, appeared in 2015; users who previously had files
    238 with the same name had to change them. There is no guarantee that
    239 <a href="s6-supervise.html">s6-supervise</a> will not use additional
    240 names in the service directory in the same fashion in the future.
    241 </p>
    242 
    243 <p>
    244  There <em>is</em>, however, a guarantee that
    245 <a href="s6-supervise.html">s6-supervise</a> will never touch
    246 subdirectories named <tt>data</tt> or <tt>env</tt>. So if you
    247 need to store user information in the service directory with
    248 the guarantee that it will never be mistaken for a configuration
    249 file, no matter the version of s6, you should store that information in
    250 the <tt>data</tt> or <tt>env</tt> subdirectories of the service
    251 directory.
    252 </p>
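
<p>
 One possible use of that guarantee, sketched below, is to keep
service-specific environment variables in the <tt>env</tt> subdirectory,
one file per variable, and load them from the run script with
<a href="s6-envdir.html">s6-envdir</a>. The variable names and values, the
<tt>foo-daemon</tt> program and the scratch path are hypothetical. The
sketch only writes files.
</p>

<pre>
```shell
#!/bin/sh
# Sketch only: stores hypothetical settings in foo/env and writes a run
# script that loads them with s6-envdir before starting the daemon.
set -e

svc="${TMPDIR:-/tmp}/s6-env-example/foo"   # hypothetical location
rm -rf "$svc"
mkdir -p "$svc/env"

# One file per variable: the file name is the variable name,
# the first line of the file is its value.
echo 8080         > "$svc/env/FOO_PORT"
echo /var/lib/foo > "$svc/env/FOO_DATADIR"

# The run script is started with the service directory as its working
# directory, so a relative path to ./env works.
cat > "$svc/run" <<'EOF'
#!/bin/sh
exec 2>&1
exec s6-envdir ./env foo-daemon
EOF
chmod 755 "$svc/run"
```
</pre>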
    253 
    254 <a name="where">
    255  <h2> Where should I store my service directories? </h2>
    256 </a>
    257 
    258 <p>
    259  Service directories describe the way services are launched. Once they are
    260 designed, they have little reason to change on a given machine. They can
    261 theoretically reside on a read-only filesystem - for instance, the root
    262 filesystem, to avoid problems with mounting failures.
    263 </p>
    264 
    265 <p>
    266  However, two subdirectories - namely <tt>supervise</tt> and <tt>event</tt> -
    267 of every service directory need to be writable. So the actual setup has to be a bit more
    268 complex. Here are a few possibilities.
    269 </p>
    270 
    271 <ul>
    272  <li> The laziest option: you're not using <a href="s6-svscan.html">s6-svscan</a>
    273 as process 1, you're only using it to start a collection of services, and
    274 your booting process is already handled by another init system. Then you can
    275 just store your service directories and your <a href="scandir.html">scan
    276 directory</a> on some read-write filesystem such as <tt>/var</tt>; and you
    277 tell your init system to launch (and, if possible, maintain) s6-svscan on
    278 the scan directory after that filesystem is mounted. </li>
    279  <li> The almost-as-lazy option: just have the service directories on the
    280 root filesystem. Then your service directory collection is for instance in
    281 <tt>/etc/services</tt> and you have a <tt>/service</tt>
    282 <a href="scandir.html">scan directory</a> containing symlinks to that
    283 collection. This is the easy setup, not requiring an external init system
    284 to mount your filesystems - however, it requires your root filesystem to be
    285 read-write, which is unacceptable if you are concerned with reliability - if
    286 you are, for instance, designing an embedded platform. </li>
    287  <li> <a href="https://code.dogmap.org/">Some people</a> like to have
    288 their service directories in a read-only filesystem, with <tt>supervise</tt>
    289 symlinks pointing to various places in writable filesystems. This setup looks
    290 a bit complex to me: it requires careful handling of the writable
    291 filesystems, with not much room for error if the directory structure does not
    292 match the symlinks (which are then dangling). But it works. </li>
    293  <li> Service directories are usually small; most daemons store their
    294 information elsewhere. Even a complete set of service directories often
    295 amounts to less than a megabyte of data - sometimes much less. Knowing this,
    296 it makes sense to have an image of your service directories in the
    297 (possibly read-only) root filesystem, and <em>copy it all</em>
    298 to a scan directory located on a RAM filesystem that is mounted at boot time.
    299 This is the setup I recommend, and the one used by the
    300 <a href="//skarnet.org/software/s6-rc/">s6-rc</a> service manager.
    301  It has several advantages:
    302  <ul>
    303   <li> Your service directories reside on the root filesystem and are not
    304 modified during the lifetime of the system. If your root filesystem is
    305 read-only and you have a working set of service directories, you have the
    306 guarantee that a reboot will set your system in a working state. </li>
    307  <li> Every boot system requires an early writable filesystem, and many
    308 create it in RAM. You can take advantage of this to copy your service
    309 directories early and run s6-svscan early. </li>
    310  <li> No dangling symlinks or potential problems with unmounted
    311 filesystems: this setup is robust. A simple <tt>/bin/cp -a</tt> or
    312 <tt>tar -x</tt> is all it takes to get a working service infrastructure. </li>
    313  <li> You can make temporary modifications to your service directories
    314 without affecting the main ones, safely stored on the disk. Conversely,
    315 every boot ensures clean service directories - including freshly created
    316 <tt>supervise</tt> and <tt>event</tt> subdirectories. No stale files can
    317 make your system unstable. </li>
    318  </ul> </li>
    319 </ul>
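
<p>
 The copy-to-RAM setup recommended above can be sketched as follows. The
paths are hypothetical stand-ins: on a real system the image would live on
the root filesystem and the scan directory on a RAM filesystem mounted
early at boot. Here both are scratch directories so the sketch can be run
safely, and s6-svscan is not actually started.
</p>

<pre>
```shell
#!/bin/sh
# Sketch only: copies an image of service directories into a scan directory.
set -e

base="${TMPDIR:-/tmp}/s6-copy-example"   # hypothetical scratch area
image="$base/image"                      # stand-in for the read-only image
scandir="$base/run/service"              # stand-in for a scandir on a RAM filesystem

# Build a tiny stand-in image containing one service directory.
rm -rf "$base"
mkdir -p "$image/foo"
printf '#!/bin/sh\nexec foo-daemon\n' > "$image/foo/run"   # foo-daemon is a placeholder
chmod 755 "$image/foo/run"

# What a boot script would do once the RAM filesystem is mounted:
mkdir -p "$scandir"
cp -a "$image/." "$scandir/"

# A real boot script would then run something like:
#   exec s6-svscan "$scandir"
```
</pre>

<p>
 Since the image itself is never written to, the <tt>supervise</tt> and
<tt>event</tt> subdirectories only ever exist in the copy.
</p>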
    320 
    321 </body>
    322 </html>