<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Language" content="en" />
<title>s6: service directories</title>
<meta name="Description" content="s6: service directory" />
<meta name="Keywords" content="s6 supervision supervise service directory run finish servicedir" />
<!-- <link rel="stylesheet" type="text/css" href="//skarnet.org/default.css" /> -->
</head>
<body>

<p>
<a href="index.html">s6</a><br />
<a href="//skarnet.org/software/">Software</a><br />
<a href="//skarnet.org/">skarnet.org</a>
</p>

<h1> Service directories </h1>

<p>
A <em>service directory</em> is a directory containing all the information
related to a <em>service</em>, i.e. a long-running process maintained and
supervised by <a href="s6-supervise.html">s6-supervise</a>.
</p>

<p>
(Strictly speaking, a <em>service</em> is not always equivalent to a
long-running process. Things like Ethernet interfaces fit the definition
of <em>services</em> one may want to supervise; however, s6 does not
provide <em>service supervision</em>; it provides <em>process supervision</em>,
and it is impractical to use the s6 architecture as is to supervise
services that are not equivalent to one long-running process. However,
we still use the terms <em>service</em> and <em>service directory</em>
for historical and compatibility reasons.)
</p>

<h2> Contents </h2>

A service directory <em>foo</em> may contain the following elements:

<ul>
<li style="margin-bottom:1em"> An executable file named <tt>run</tt>. It can be any executable
file (such as a binary file or a link to any other executable file),
but most of the time it will be a script, called a <em>run script</em>.
This file is the most important one in your service directory: it
contains the commands that will set up and run your <em>foo</em> service.
<ul>
<li> It is forked and executed by <a href="s6-supervise.html">s6-supervise</a>
every time the service must be started, i.e. normally when
<a href="s6-supervise.html">s6-supervise</a> starts, and whenever
the service goes down when it is supposed to be up. </li>
<li> It is given one argument, which is the same argument that the
<a href="s6-supervise.html">s6-supervise</a> process is running with,
i.e. the name of the service directory, or, if
<a href="s6-supervise.html">s6-supervise</a> is run under
<a href="s6-svscan.html">s6-svscan</a>, the name of the service directory
as seen by <a href="s6-svscan.html">s6-svscan</a> in its
<a href="scandir.html">scan directory</a>. That is, <tt><em>foo</em></tt>
or <tt><em>foo</em>/log</tt>, if <em>foo</em> is the name of the
<em>symbolic link</em> in the scan directory. </li> </ul>

<p> A run script should normally: </p>
<ul>
<li> adjust redirections for stdin, stdout and stderr. When a run
script starts, it inherits its standard file descriptors from
<a href="s6-supervise.html">s6-supervise</a>, which itself inherits them from
<a href="s6-svscan.html">s6-svscan</a>. stdin is normally <tt>/dev/null</tt>.
If s6-svscan was launched by another init system, stdout and stderr likely
point to that init system's default log (or <tt>/dev/null</tt> in the case
of sysvinit). If s6-svscan is running as pid 1 with the help of software like
<a href="//skarnet.org/software/s6-linux-init/">s6-linux-init</a>, then its
stdout and stderr point to a <em>catch-all logger</em>, which catches and
logs any output of the supervision tree that has not been caught by a
dedicated logger.
If the defaults provided by your installation are not
suitable for your run script, then your run script should perform the proper
redirections before executing into the final daemon. For instance, dedicated
logging mechanisms, such as the <tt>log</tt> subdirectory (see below) or the
<a href="//skarnet.org/software/s6-rc/">s6-rc</a> pipeline feature, pipe your
run script's <em>stdout</em> to the logging service, but chances are you want
to log <em>stderr</em> as well, so the run script should make sure that its
stderr goes into the log pipe. This
is achieved by <tt><a href="//skarnet.org/software/execline/fdmove.html">fdmove</a>
-c 2 1</tt> in <a href="//skarnet.org/software/execline/">execline</a>,
and <tt>exec 2&gt;&amp;1</tt> in <a href="https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sh.html">shell</a>.
</li>
<li> adjust the environment for your <em>foo</em> daemon. Normally the run script
inherits its environment from <a href="s6-supervise.html">s6-supervise</a>,
which normally inherits its environment from <a href="s6-svscan.html">s6-svscan</a>,
which normally inherits a minimal environment from the boot scripts.
Service-specific environment variables should be set in the run script. </li>
<li> adjust other parameters for the <em>foo</em> daemon, such as its
uid and gid. Normally the supervision tree, i.e.
<a href="s6-svscan.html">s6-svscan</a> and the various
<a href="s6-supervise.html">s6-supervise</a> processes, is run as root, so
run scripts are also run as root; however, for security purposes, services
should not run as root if they don't need to. You can use the
<a href="s6-setuidgid.html">s6-setuidgid</a> utility in <em>foo</em><tt>/run</tt>
to lose privileges before executing into <em>foo</em>'s long-lived
process; or the <a href="s6-envuidgid.html">s6-envuidgid</a> utility if
your long-lived process needs root privileges at start time but can drop
them afterwards.
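<p> As a rough illustration, the redirection, environment and privilege steps
can be combined in a short shell run script such as the sketch below. The
<tt>food</tt> daemon, its <tt>-f</tt> "stay in the foreground" option, the
<tt>fooservice</tt> account and the <tt>FOO_CONFIG</tt> variable are all
hypothetical names made up for this example; only <tt>s6-setuidgid</tt>
comes from s6, and it is assumed to be in the PATH. </p>
<pre>
#!/bin/sh
# Send stderr to the same place as stdout (for instance the log pipe).
exec 2&gt;&amp;1
# Service-specific environment (hypothetical variable).
export FOO_CONFIG=/etc/foo.conf
# Drop root privileges, then execute into the daemon, which must stay
# in the foreground (hypothetical account, binary and option).
exec s6-setuidgid fooservice food -f
</pre>
<p> Every step uses <tt>exec</tt>, so the daemon ends up being the very process
that was started as <tt>./run</tt>, with no shell left in between. </p>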
</li>
<li> execute into the long-lived process that is to be supervised by
<a href="s6-supervise.html">s6-supervise</a>, i.e. the real <em>foo</em>
daemon. That process must not "background itself": being run by a supervision
tree already makes it a "background" task. </li>
</ul> </li>

<li style="margin-bottom:1em"> An optional executable file named <tt>finish</tt>. Like <tt>run</tt>,
it can be any executable file. This <em>finish script</em>, if present,
is executed every time the <tt>run</tt> script dies. Generally, its main
purpose is to clean up non-volatile data such as the filesystem after the supervised
process has been killed. If the <em>foo</em> service is supposed to be up,
<em>foo</em><tt>/run</tt> is restarted after <em>foo</em><tt>/finish</tt> dies.
<ul>
<li> By default, a finish script must do its work and exit in less than
5 seconds; if it takes more than that, it is killed. (The point is that the run
script, not the finish script, should be running; the finish script should really
be short-lived.) The maximum duration of a <tt>finish</tt> execution can be
configured via the <tt>timeout-finish</tt> file, see below. </li>
<li> The finish script is executed with three arguments:
<ol>
<li> the exit code from the run script (resp. 256 if the run script was killed by a signal) </li>
<li> an undefined number (resp. the number of the signal that killed the run script) </li>
<li> the name of the service directory, the same that has been given to <tt>./run</tt>. </li>
</ol> </li>
<li> If the finish script exits 125, then <a href="s6-supervise.html">s6-supervise</a>
interprets this as a permanent failure for the service, and does not restart it,
as if an <a href="s6-svc.html">s6-svc -O</a> command had been sent.
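<p> A minimal sketch of a shell finish script using these conventions could
look as follows. It reports how the run script ended and asks for a
permanent failure in one case; treating exit code 100 as unrecoverable is an
arbitrary choice made for this example, not something s6 defines. </p>
<pre>
#!/bin/sh
# $1: exit code of the run script, or 256 if it was killed by a signal
# $2: signal number (only meaningful when $1 is 256)
# $3: name of the service directory
if [ "$1" -eq 256 ] ; then
  echo "$3: killed by signal $2" 1&gt;&amp;2
elif [ "$1" -eq 100 ] ; then
  echo "$3: unrecoverable error, not restarting" 1&gt;&amp;2
  exit 125
else
  echo "$3: exited $1" 1&gt;&amp;2
fi
exit 0
</pre>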
</li>
<li> If <a href="s6-supervise.html">s6-supervise</a> has been instructed to exit after
the service dies, via an <tt>s6-svc -x</tt> command or a SIGHUP, then the next
invocation of <tt>finish</tt> will (obviously) be the last, and it will run with
stdin and stdout pointing to <tt>/dev/null</tt>. </li>
</ul> </li>

<li style="margin-bottom:1em"> A directory named <tt>supervise</tt>. It is automatically created by
<a href="s6-supervise.html">s6-supervise</a> if it does not exist. This is where
<a href="s6-supervise.html">s6-supervise</a> stores its internal information.
The directory must be writable. </li>

<li style="margin-bottom:1em"> An optional, empty, regular file named <tt>down</tt>. If such a file exists,
the default state of the service is considered down, not up: s6-supervise will not
automatically start it until it receives an <tt>s6-svc -u</tt> command. If no
<tt>down</tt> file exists, the default state of the service is up. </li>

<li style="margin-bottom:1em"> An optional regular file named <tt>notification-fd</tt>. If such a file
exists, it means that the service supports
<a href="notifywhenup.html">readiness notification</a>. The file must only
contain an unsigned integer, which is the number of the file descriptor that
the service writes its readiness notification to. (For instance, it should
be 1 if the daemon is <a href="s6-ipcserverd.html">s6-ipcserverd</a> run with the
<tt>-1</tt> option.)
When a service is started, or restarted, by s6-supervise, if this file
exists and contains a valid descriptor number, s6-supervise will wait for the
notification from the service and broadcast readiness, i.e. any
<a href="s6-svwait.html">s6-svwait -U</a>,
<a href="s6-svlisten1.html">s6-svlisten1 -U</a> or
<a href="s6-svlisten.html">s6-svlisten -U</a> processes will be
triggered.
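<p> As a small sketch, declaring that a service notifies readiness on
descriptor 1 and then waiting for it could look like this. The
<tt>/run/service/foo</tt> path is a hypothetical location chosen for the
example. </p>
<pre>
# In the service directory: readiness is written to descriptor 1.
echo 1 &gt; /run/service/foo/notification-fd
# From anywhere else: block until foo is up and ready.
s6-svwait -U /run/service/foo
</pre>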
</li>

<li style="margin-bottom:1em"> An optional regular file named <tt>lock-fd</tt>. If such a file
exists, it must contain an unsigned integer, representing a file descriptor that
will be open in the service. The service <em>should not write to that descriptor</em>
and <em>should not close it</em>. In other words, it should totally ignore it. That
file descriptor holds a lock that will naturally be released when the service dies.
The point of this feature is to prevent s6-supervise from accidentally spawning several
copies of the service in case something goes wrong: for instance, the service
backgrounds itself (which it shouldn't do when running under a supervision suite), or
s6-supervise is killed, restarted by s6-svscan, and attempts to start another copy of
the service while the first copy is still alive. If s6-supervise detects that the lock
is held when it tries to start the service, it will print a warning message; the new
service instance will block until the lock is released, then proceed as usual. </li>

<li style="margin-bottom:1em"> An optional regular file named <tt>timeout-kill</tt>. If such a file
exists, it must only contain an unsigned integer <em>t</em>. If <em>t</em>
is nonzero, then on receipt of an <a href="s6-svc.html">s6-svc -d</a> command,
which sends a SIGTERM (by default, see <tt>down-signal</tt> below) and a
SIGCONT to the service, a timeout of <em>t</em>
milliseconds is set; and if the service is still not dead after <em>t</em>
milliseconds, then it is sent a SIGKILL. If <tt>timeout-kill</tt> does not
exist, or contains 0 or an invalid value, then the service is never
forcibly killed (unless, of course, an <a href="s6-svc.html">s6-svc -k</a>
command is sent). </li>

<li style="margin-bottom:1em"> An optional regular file named <tt>timeout-finish</tt>.
If such a file
exists, it must only contain an unsigned integer, which is the number of
milliseconds after which the <tt>./finish</tt> script, if it exists, will
be killed with a SIGKILL. The default is 5000: finish scripts are killed
if they're still alive after 5 seconds. A value of 0 allows finish scripts
to run forever. </li>

<li style="margin-bottom:1em"> An optional regular file named <tt>max-death-tally</tt>. If such a file
exists, it must only contain an unsigned integer, which is the maximum number of
service death events that s6-supervise will keep track of. If the service dies
more than this number of times, the oldest events will be forgotten. Tracking
death events is useful, for instance, when throttling service restarts. The
value cannot be greater than 4096. If the file does not exist, a default of 100
is used. </li>

<li style="margin-bottom:1em"> An optional regular file named <tt>down-signal</tt>. If such a file
exists, it must only contain the name or number of a signal, followed by a
newline. This signal will be used to kill the supervised process when a
<a href="s6-svc.html">s6-svc -d</a> or <a href="s6-svc.html">s6-svc -r</a>
command is used. If the file does not exist, SIGTERM will be used by default. </li>

<li style="margin-bottom:1em"> A <a href="fifodir.html">fifodir</a> named <tt>event</tt>. It is automatically
created by <a href="s6-supervise.html">s6-supervise</a> if it does not exist.
<em>foo</em><tt>/event</tt>
is the rendezvous point for listeners, where <a href="s6-supervise.html">s6-supervise</a>
will send notifications when the service goes up or down. </li>

<li style="margin-bottom:1em"> Optional directories named <tt>instance</tt>
and <tt>instances</tt>. Those are internal subdirectories created by
<a href="s6-instance-maker.html">s6-instance-maker</a> in a templated service
directory.
Outside of instanced services, these directories should never
appear, and you should never create them manually. </li>


<li style="margin-bottom:1em"> An optional service directory named <tt>log</tt>. If it exists and <em>foo</em>
is in a <a href="scandir.html">scandir</a>, and <a href="s6-svscan.html">s6-svscan</a>
runs on that scandir, then <em>two</em> services are monitored: <em>foo</em> and
<em>foo</em><tt>/log</tt>. A pipe is open and maintained between <em>foo</em> and
<em>foo</em><tt>/log</tt>, i.e. everything that <em>foo</em><tt>/run</tt>
writes to its stdout will appear on <em>foo</em><tt>/log/run</tt>'s stdin. The <em>foo</em>
service is said to be <em>logged</em>; the <em>foo</em><tt>/log</tt> service is called
<em>foo</em>'s <em>logger</em>. A logger service cannot be logged: if
<em>foo</em><tt>/log/log</tt> exists, nothing special happens. </li>
</ul>

<h3> Stability </h3>

<p>
With the evolution of s6, it is possible that
<a href="s6-supervise.html">s6-supervise</a> configuration uses more and more
files in the service directory. The
<tt>notification-fd</tt> and <tt>timeout-finish</tt> files, for
instance, appeared in 2015; users who previously had files
with the same name had to change them. There is no guarantee that
<a href="s6-supervise.html">s6-supervise</a> will not use additional
names in the service directory in the same fashion in the future.
</p>

<p>
There <em>is</em>, however, a guarantee that
<a href="s6-supervise.html">s6-supervise</a> will never touch
subdirectories named <tt>data</tt> or <tt>env</tt>. So if you
need to store user information in the service directory with
the guarantee that it will never be mistaken for a configuration
file, no matter the version of s6, you should store that information in
the <tt>data</tt> or <tt>env</tt> subdirectories of the service
directory.
</p>

<a name="where">
<h2> Where should I store my service directories? </h2>
</a>

<p>
Service directories describe the way services are launched. Once they are
designed, they have little reason to change on a given machine. They can
theoretically reside on a read-only filesystem - for instance, the root
filesystem, to avoid problems with mounting failures.
</p>

<p>
However, two subdirectories - namely <tt>supervise</tt> and <tt>event</tt> -
of every service directory need to be writable. So it has to be a bit more
complex. Here are a few possibilities.
</p>

<ul>
<li> The laziest option: you're not using <a href="s6-svscan.html">s6-svscan</a>
as process 1, you're only using it to start a collection of services, and
your booting process is already handled by another init system. Then you can
just store your service directories and your <a href="scandir.html">scan
directory</a> on some read-write filesystem such as <tt>/var</tt>; and you
tell your init system to launch (and, if possible, maintain) s6-svscan on
the scan directory after that filesystem is mounted. </li>
<li> The almost-as-lazy option: just have the service directories on the
root filesystem. Then your service directory collection is for instance in
<tt>/etc/services</tt> and you have a <tt>/service</tt>
<a href="scandir.html">scan directory</a> containing symlinks to that
collection. This is the easy setup, not requiring an external init system
to mount your filesystems - however, it requires your root filesystem to be
read-write, which is unacceptable if you are concerned with reliability - if
you are, for instance, designing an embedded platform.
</li>
<li> <a href="https://code.dogmap.org/">Some people</a> like to have
their service directories in a read-only filesystem, with <tt>supervise</tt>
symlinks pointing to various places in writable filesystems. This setup looks
a bit complex to me: it requires careful handling of the writable
filesystems, with not much room for error if the directory structure does not
match the symlinks (which are then dangling). But it works. </li>
<li> Service directories are usually small; most daemons store their
information elsewhere. Even a complete set of service directories often
amounts to less than a megabyte of data - sometimes much less. Knowing this,
it makes sense to have an image of your service directories in the
(possibly read-only) root filesystem, and <em>copy it all</em>
to a scan directory located on a RAM filesystem that is mounted at boot time.
This is the setup I recommend, and the one used by the
<a href="//skarnet.org/software/s6-rc/">s6-rc</a> service manager.
It has several advantages:
<ul>
<li> Your service directories reside on the root filesystem and are not
modified during the lifetime of the system. If your root filesystem is
read-only and you have a working set of service directories, you have the
guarantee that a reboot will set your system in a working state. </li>
<li> Every boot system requires an early writable filesystem, and many
create it in RAM. You can take advantage of this to copy your service
directories early and run s6-svscan early. </li>
<li> No dangling symlinks or potential problems with unmounted
filesystems: this setup is robust. A simple <tt>/bin/cp -a</tt> or
<tt>tar -x</tt> is all it takes to get a working service infrastructure. </li>
<li> You can make temporary modifications to your service directories
without affecting the main ones, safely stored on the disk.
Conversely,
every boot ensures clean service directories - including freshly created
<tt>supervise</tt> and <tt>event</tt> subdirectories. No stale files can
make your system unstable. </li>
</ul> </li>
</ul>

</body>
</html>