<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Language" content="en" />
<title>s6: an overview</title>
<meta name="Description" content="s6: an overview" />
<meta name="Keywords" content="s6 overview supervision init process unix" />
<!-- <link rel="stylesheet" type="text/css" href="//skarnet.org/default.css" /> -->
</head>
<body>

<p>
<a href="index.html">s6</a><br />
<a href="//skarnet.org/software/">Software</a><br />
<a href="//skarnet.org/">skarnet.org</a>
</p>

<h1> An overview of s6 </h1>

<p>
s6 is a collection of utilities revolving around process supervision and
management, logging, and system initialization. This page is a high-level
description of the different parts of s6.
</p>

<h2> Process supervision </h2>

<p>
At its core, s6 is a <em>process supervision suite</em>, like its ancestor
<a href="https://cr.yp.to/daemontools.html">daemontools</a> and its
close cousin
<a href="http://smarden.org/runit/">runit</a>.
</p>

<h3> Concept </h3>

<p>
The concept of process supervision comes from several observations:
</p>

<ul>
<li> Unix systems, even minimalistic ones, need to run
<em>long-lived processes</em>, aka <em>daemons</em>. That is one of the
core design principles of Unix: one service → one daemon. </li>
<li> Daemons can die unexpectedly. Maybe they are missing a vital
resource and cannot handle a certain failure; maybe they tripped on a bug;
maybe a misconfigured administration program killed them; maybe the
kernel killed them. Processes are fragile, but daemons are vital to a
Unix system: a fundamental discrepancy that needs to be solved. </li>
<li> Automatically restarting daemons when they die is generally a good
thing.
In any case, sysadmin intervention is necessary, but at least the
daemon is providing service, or trying to, until the sysadmin can log in
and investigate the underlying problem. </li>
<li> Ad-hoc shell scripts that restart daemons <strong>suck</strong>, for
several reasons that would each justify their own page. The difficulty of
keeping track of the PID, explained below, is one of those reasons. </li>
<li> It is sometimes necessary to send signals to a daemon. To kill it,
of course, but also to make it read its config file again, for instance;
signalling a daemon is a natural and very common way of sending it
simple commands. </li>
<li> Generally, to send a signal to a daemon, you need to know its PID.
Without a supervision suite, knowing the proper PID is hard. Most
non-supervision systems use a hack known as <em>.pid files</em>, i.e.
the script that starts the daemon stores its PID into a file, and other
scripts read that file. This is a bad mechanism for several reasons, and
the case against .pid files would also justify its own page; the most
important drawback of .pid files is that they create race conditions,
so management scripts may kill the wrong process. </li>
<li> Non-supervision systems provide scripts to start and stop daemons,
but those scripts may fail at boot time even though they work when run
manually,
and vice versa. If a sysadmin logs in and runs the script to restart a
daemon that has died, the result might not be the same as if the whole
system had been rebooted, and the daemon may exhibit strange behaviours!
This is because the boot-time environment and the restart-time environment
are not the same when the script is run; and a non-supervision system
just cannot ensure reproducibility of the environment.
This is a core
problem of non-supervision systems: countless bugs have been falsely
reported because of simple environment differences or configuration errors,
and countless man-hours have been wasted trying to understand what was
going on. </li>
</ul>

<p>
A process supervision system organizes the process hierarchy in a
radically different way.
</p>

<ul>
<li> A process supervision system starts an independent hierarchy of
processes at boot time, called a <em>supervision tree</em>. This
supervision tree never dies: when one of its components dies, it is
restarted automatically. To ensure availability of the supervision
tree at all times, it should be rooted in process 1, which cannot die. </li>
<li> A daemon is never started, either manually or in a script, as a
scion of the script that starts it.
Instead, to start a daemon, you configure a
specific directory which contains all the information about your daemon;
then you send a command to the supervision tree. The supervision tree
will start the daemon as a leaf. <strong>In a process supervision
system, daemons are always spawned by the supervision tree, and
never by an admin's shell.</strong> </li>
<li> The parent of your daemon is a <em>supervisor</em>. Since your
daemon is its direct child, <strong>the supervisor always knows the
correct PID of your daemon</strong>. </li>
<li> The supervisor watches your daemon and can restart it when it
dies, automatically. </li>
<li> The supervision tree always has the same environment, so starting
conditions are reproducible. Your daemon will always be started with the
same environment, whether it is at boot time via init scripts or for the
100th automatic - or manual - restart. </li>
<li> To send signals to your daemon, you send a command to its
supervisor, which will then send a signal to the daemon on your behalf.
Your daemon is identified by the directory containing its information,
which is stable, instead of by its PID, which is not stable; the supervisor
maintains the correct association without a race condition or the other
problems of .pid files. </li>
</ul>

<h3> Implementation </h3>

<p>
s6 is a straightforward implementation of those concepts.
</p>

<ul>
<li> The <a href="s6-svscan.html">s6-svscan</a> and
<a href="s6-supervise.html">s6-supervise</a> programs are the components
of the <em>supervision tree</em>. They are long-lived programs.
<ul>
<li> <a href="s6-supervise.html">s6-supervise</a> is a daemon's
<em>supervisor</em>, its direct parent. For every long-lived process on a
system, there is a corresponding <a href="s6-supervise.html">s6-supervise</a>
process watching it. This is okay, because every instance of
<a href="s6-supervise.html">s6-supervise</a> uses very few resources. </li>
<li> <a href="s6-svscan.html">s6-svscan</a> is, in a manner of speaking,
a supervisor for the supervisors. It watches and maintains a collection of
<a href="s6-supervise.html">s6-supervise</a> processes: it is the branch
of the supervision tree that all supervisors stem from. It can be
run and
<a href="//skarnet.org/software/s6/s6-svscan-not-1.html">supervised
by your regular init process</a>, or it can
<a href="//skarnet.org/software/s6/s6-svscan-1.html">run as
process 1 itself</a>. Running s6-svscan as process 1 requires
some effort from the user, because of the inherent non-portability of
init processes; the
<a href="//skarnet.org/software/s6-linux-init/">s6-linux-init</a>
package automates that effort and allows users to run s6 as an init
replacement. </li>
<li> The configuration of a daemon to be supervised by
<a href="s6-supervise.html">s6-supervise</a> is done via a
<a href="servicedir.html">service directory</a>.
</li>
<li> The place to gather all service directories to be watched by a
<a href="s6-svscan.html">s6-svscan</a> instance is called a
<a href="scandir.html">scan directory</a>. </li>
</ul>
<li> The command that controls a single supervisor, and allows you to
send signals to a daemon, is
<a href="s6-svc.html">s6-svc</a>. It is a short-lived program. </li>
<li> The command that controls a set of supervisors, and allows you to
start and stop supervision trees, is
<a href="s6-svscanctl.html">s6-svscanctl</a>. It is a short-lived
program. </li>
</ul>

<p>
These four programs,
<a href="s6-svscan.html">s6-svscan</a>,
<a href="s6-supervise.html">s6-supervise</a>,
<a href="s6-svscanctl.html">s6-svscanctl</a> and
<a href="s6-svc.html">s6-svc</a>,
are the very core of s6. Technically, once you have them, you have a
functional s6 installation, and the other utilities are just a bonus.
</p>

<h3> Practical usage </h3>

<p>
To use s6's supervision features, you need to perform the following steps:
</p>

<ul>
<li> For every daemon you potentially want supervised, write a
<a href="servicedir.html">service directory</a>. Make sure that
your daemon does not background itself when started in the
<tt>./run</tt> script! Auto-backgrounding is a historical hack
that was implemented when supervision suites did not exist; since
you're using a supervision suite, auto-backgrounding is unnecessary
and in this case detrimental. </li>
<li> Write a single <a href="scandir.html">scan directory</a> for
the set of daemons you want to actually run. This set can be modified
at run time. </li>
<li> At some point in your initialization scripts, run
<a href="s6-svscan.html">s6-svscan</a> on the scan directory. This will
start the supervision tree, including your set of daemons.
The exact
way of running s6-svscan depends on your system: it is not quite the same
when you want to run it as process 1 on a real machine, or under another
init on a real machine, or as process 1 in a
<a href="https://www.docker.com/">Docker</a> container, or in another
context entirely. </li>
<li> Alternatively, you can start <a href="s6-svscan.html">s6-svscan</a>
on an empty scan directory, then populate it step by step and send an
update command to s6-svscan via
<a href="s6-svscanctl.html">s6-svscanctl</a> whenever the supervision
tree should pick up the differences and start the services you added. </li>
<li> That's it, your services are running. To control them manually,
you can use the <a href="s6-svc.html">s6-svc</a> command. </li>
<li> At the end of the system's lifetime, you can use
<a href="s6-svscanctl.html">s6-svscanctl</a> to bring down the supervision
tree. </li>
</ul>

<h2> Service-specific logging </h2>

<p>
<a href="s6-svscan.html">s6-svscan</a> can monitor a supervision tree,
but it can also do one more thing. It can ensure that a daemon's log,
i.e. what the daemon outputs to its stdout (or stderr if you redirect it),
gets processed by another, supervised, long-lived process, called a
<em>logger</em>; and it can make sure that the logs are never lost
between the daemon and the logger - even if the daemon dies, even if the
logger dies.
</p>

<p>
If your daemon is outputting messages, you have a decision to make
about where to send them.
</p>

<ul>
<li> You can do as non-supervision systems do, and send the messages
to syslog. It's entirely possible with a supervision system too.
However, like auto-backgrounding, syslog is a historical mechanism that
predates supervision suites, and is technically inferior; it is
recommended that you avoid it whenever you can.
</li>
<li> You can send them to the daemon's stdout/stderr and do nothing special
about it. The logs will then be sent to s6-svscan's stdout/stderr;
what mechanism will read them depends on how you started s6-svscan. </li>
<li> You can use s6-svscan's service-specific logging mechanism and
dedicate a logger process to your daemon's messages. </li>
</ul>

<p>
s6 provides you with a long-lived process to use as a logger:
<a href="s6-log.html">s6-log</a>. It will store your logs in one (or
more) specific directory of your choice, and rotate them automatically.
</p>

<h2> Helpers for run scripts </h2>

<p>
Creating a working
<a href="servicedir.html">service directory</a>, and especially a good
<em>run script</em>, is the most important part of the work when
adapting a daemon to a supervision framework.
</p>

<p>
If you can find your daemon's invocation script on a non-supervision system,
for instance a System V-style init script, you can see the exact
options that the daemon is being run with: environment variables,
uid and gid, open descriptors, etc. This is what you
need to replicate in your run script.
</p>

<p>
(Do not replicate the auto-backgrounding, or things like
<a href="http://man.he.net/man8/start-stop-daemon">start-stop-daemon</a>
invocation: start-stop-daemon and its friends are hideous and kludgy
attempts to work around the lack of proper supervision mechanisms. Now
that you have s6, you should remove them from your system, throw them
into a bonfire, and dance and laugh while they burn. Generally speaking,
as a system administrator you want daemons that have been designed
following the principles described
<a href="https://jdebp.uk/FGA/unix-daemon-design-mistakes-to-avoid.html">here</a>,
or at least you want to use the command-line options that make them
behave in such a way.)
</p>

<p>
The vast majority of the tools provided by s6 are meant to be used in
run scripts: they help you control the process state and
environment in your script before it executes into your daemon. Or,
sometimes, they are daemons themselves, designed to be supervised.
</p>

<p>
s6, like other <a href="//skarnet.org/software/">skarnet.org
software</a>, makes heavy use of
<a href="https://en.wikipedia.org/wiki/Chain_loading#Chain_loading_in_Unix">chain
loading</a>, also known as "Bernstein chaining": a lot of s6 tools will
perform some action that changes the process state, then execute into the
rest of their command line. This allows the user to change the process state
in a very flexible way, by combining the right components in the right
order. Very often, a run script can be reduced to a single command line -
likely a long one, but still a single one. (That is the main reason why
using the
<a href="//skarnet.org/software/execline/">execline</a> language
to write run scripts is recommended: execline makes it natural to handle
long command lines made of massive amounts of chain loading. This is by no
means mandatory, though: a run script can be any executable file you want,
provided that running it eventually results in a long-lived process with
the same PID.)
</p>

<p>
Some examples of s6 programs meant to be used in run scripts:
</p>

<ul>
<li> The <a href="s6-log.html">s6-log</a> program is a long-lived
process. It is meant to be executed into by a <tt>./log/run</tt>
script: it will be supervised, and will process what it reads on
its stdin (i.e. the output of the <tt>./run</tt> daemon).
</li>
<li> The <a href="s6-envdir.html">s6-envdir</a> program is a
short-lived process that will update its current environment according
to what it reads in a given directory, then execute into the rest of its
command line. It is meant to be used in a run script to adjust the
environment that the final daemon will be executed with. </li>
<li> Similarly, the <a href="s6-softlimit.html">s6-softlimit</a> program
adjusts its resource limits, then executes into the rest of its command
line: it is meant to set the resources the final daemon will have
access to. </li>
<li> The <a href="s6-applyuidgid.html">s6-applyuidgid</a> program,
part of the <tt>s6-*uidgid</tt> family, drops root privileges before
executing into the rest of its command line: it is meant to be used
in run scripts that need root privileges when starting but do not
need them for the execution of the long-lived process. </li>
<li> <a href="s6-ipcserverd.html">s6-ipcserverd</a> is a daemon that
listens to a Unix socket and spawns a program for every connection.
It is meant to be supervised, so it should be used in a run script,
and it's also meant to be a flexible super-server that you can use
for different applications: so it is a building block that may appear in
several of your run scripts defining
<a href="localservice.html">local services</a>. </li>
</ul>

<h2> Readiness notification and dependency management </h2>

<p>
Now that you have a supervision tree, and long-lived processes running
supervised, you may want to introduce dependencies between them: do not
perform an action (e.g. start (with <a href="s6-svc.html">s6-svc -u</a>)
the Web server connecting to a database)
before a given daemon is up and running (e.g. the database server).
s6 provides tools to do that:
</p>

<ul>
<li> The <a href="s6-svwait.html">s6-svwait</a>,
<a href="s6-svlisten1.html">s6-svlisten1</a> and
<a href="s6-svlisten.html">s6-svlisten</a> programs will wait until a set of
daemons is up, ready, down (as soon as the <tt>./run</tt> process dies) or
really down (when the <tt>./finish</tt> process has also died). </li>
<li> Unfortunately, a daemon being <em>up</em> does not mean that it is
<em>ready</em>:
<a href="notifywhenup.html">this page</a> goes into the details. s6
supports a simple mechanism: when a daemon wants to signal that it is
<em>ready</em>, it simply writes a newline to a file descriptor of its
choice, and <a href="s6-supervise.html">s6-supervise</a> will pick that
notification up and broadcast the information to processes waiting for
it. </li>
<li> s6 also has a legacy mechanism for daemons that do not
notify their own readiness but provide a way for an external program
to check whether they're ready or not:
<a href="s6-notifyoncheck.html">s6-notifyoncheck</a>.
This is polling, which is bad, but unfortunately necessary for
many daemons as of 2019. </li>
</ul>

<p>
s6 does not provide a complete dependency management framework,
i.e. a program to automatically start (or stop) a set of services in a
specific order - that order being automatically computed from a graph of
dependencies between services.
That functionality belongs to a <em>service manager</em>, and is
implemented for instance in the
<a href="//skarnet.org/software/s6-rc/">s6-rc</a> package.
</p>

<h2> Fine-grained control over services </h2>

<p>
s6 provides you with a few more tools to control and monitor your
services.
For instance:
</p>

<ul>
<li> <a href="s6-svstat.html">s6-svstat</a> gives you access to
the detailed state of a service </li>
<li> <a href="s6-svperms.html">s6-svperms</a> allows you to configure
which users can read that state, which users can send control
commands to your service, and which users can be notified of
service start/stop events </li>
<li> <a href="s6-svdt.html">s6-svdt</a>
allows you to see what caused the latest deaths of a supervised
process </li>
</ul>

<p>
These tools make s6 the most powerful and flexible of the existing
process supervision suites.
</p>

<h2> Additional utilities </h2>

<p>
The other programs in the s6 package are various utilities that may be
useful in designing servers, and more generally multi-process software.
They can be used with or without a supervision environment, although
it is of course recommended to have one; but they are not part of the core s6
functionality, and you may safely ignore them for now if you are just getting
into the supervision world.
</p>

<h3> Generic inter-process notification </h3>

<p>
The <tt>s6-ftrig*</tt> family of programs allows notifications between
unrelated processes: a set of processes can subscribe to a certain
channel - identified by a directory in the filesystem - and ask to be
notified of certain events on that channel; another set of processes can
send events to the channel.
</p>

<p>
The underlying mechanism is the same as the one used by the supervision
tree for readiness notification, but the <tt>s6-ftrig*</tt> tools provide
more generic access to that mechanism.
</p>

<h3> Helpers for designing local services </h3>

<p>
Local services, i.e.
daemons listening to a Unix domain socket, are a
powerful and flexible mechanism, especially with modern Unix systems
that allow client authentication. s6 includes tools to take advantage
of that mechanism.
</p>

<ul>
<li> The <tt>s6-ipc*</tt> family of programs is about designing clients
or servers that communicate over Unix domain sockets. </li>
<li> The <tt>s6-*access*</tt> and <a href="s6-connlimit.html">s6-connlimit</a>
family of programs is about client access control. </li>
<li> The <tt>s6-sudo*</tt> family of programs is about using a local
service in order to give selected
clients the ability to run a command line with the privileges of the
server, without using suid programs. </li>
</ul>

<h3> Keeping file descriptors open </h3>

<p>
Sometimes you want to keep a file descriptor open, even if the program
normally using it dies - so the program can restart and use the same
file descriptor without losing any data. To do that, you need to
<em>hold</em> the descriptor in another process, i.e. that process
should have it open but do nothing with it.
</p>

<p>
<a href="s6-svscan.html">s6-svscan</a>, for instance, holds the pipe
existing between a supervised daemon and its logger, so even if the
daemon or the logger dies while there are logs in the pipe, the pipe
remains open and the logs are not lost.
</p>

<p>
s6 provides a mechanism to store and retrieve open file descriptors
in a totally generic way: the <tt>s6-fdholder*</tt> family of programs.
</p>

<ul>
<li> The <a href="s6-fdholder-daemon.html">s6-fdholder-daemon</a> program
is a daemon (or, rather, executes into the
<a href="s6-fdholderd.html">s6-fdholderd</a> daemon), meant to be
supervised, that will hold file descriptors on its clients' behalf.
</li>
<li> Other programs in the family, such as
<a href="s6-fdholder-store.html">s6-fdholder-store</a>, are client
programs that interact with this daemon to store and retrieve file
descriptors. </li>
</ul>

<p>
Note that "socket activation", one of the main advertised benefits of the
<a href="https://www.freedesktop.org/wiki/Software/systemd/">systemd</a>
init system, sounds similar to fd-holding.
The reality is that socket activation is a mixture of several different
mechanisms, one of which is fd-holding; s6 allows you to implement the
<a href="socket-activation.html">healthy parts</a> of socket activation.
</p>

<h3> Other miscellaneous utilities </h3>

<p>
This page does not list or classify every s6 tool. Please
explore the "Reference" section of the
<a href="index.html">main s6 page</a> for details on a specific program.
</p>

</body>
</html>