s6

Mirror/fork of https://skarnet.org/software/s6/

s6-permafailon.html (4121B)


<html>
  <head>
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <meta http-equiv="Content-Language" content="en" />
    <title>s6: the s6-permafailon program</title>
    <meta name="Description" content="s6: the s6-permafailon program" />
    <meta name="Keywords" content="s6 supervision finish permanent failure service" />
    <!-- <link rel="stylesheet" type="text/css" href="//skarnet.org/default.css" /> -->
  </head>
<body>

<p>
<a href="index.html">s6</a><br />
<a href="//skarnet.org/software/">Software</a><br />
<a href="//skarnet.org/">skarnet.org</a>
</p>

<h1> The <tt>s6-permafailon</tt> program </h1>

<p>
<tt>s6-permafailon</tt> is a program that is meant to be used
in the <tt>./finish</tt> script of a
<a href="servicedir.html">service directory</a> supervised by
<a href="s6-supervise.html">s6-supervise</a>. It reads and
analyses the death tally of the service (i.e. the record of its
recent process deaths); if the death tally matches a given
pattern, it causes <em>permanent failure</em> of the service,
i.e. it tells the supervisor not to restart it.
</p>

<h2> Interface </h2>

<pre>
     s6-permafailon <em>secs</em> <em>deathcount</em> <em>events</em> <em>prog...</em>
</pre>

<ul>
 <li> <tt>s6-permafailon</tt> must have the service directory of the
tested service as its current directory. This is the default if it is
called from the <tt>finish</tt> script of the service. </li>
 <li> It reads the <em>death tally</em> of the service, which is
maintained by <a href="s6-supervise.html">s6-supervise</a>. </li>
 <li> If the supervised process has died at least <em>deathcount</em>
times in the last <em>secs</em> seconds with a cause listed in
<em>events</em>, then <tt>s6-permafailon</tt> exits 125. </li>
 <li> Else <tt>s6-permafailon</tt> execs into <em>prog...</em>. </li>
</ul>
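
<p>
 The decision rule above can be sketched in a few lines of shell. This is
an illustration only, not the real implementation: the function name
<tt>permafail_check</tt>, the textual tally format (one death per line, as
a Unix timestamp followed by an exit code) and the restriction of
<em>events</em> to a single exit code interval are all assumptions made for
the example. Whether the real window boundary is inclusive is also an
assumption.
</p>

```shell
#!/bin/sh
# Hypothetical sketch of the s6-permafailon decision rule (not the real code).
# stdin: one death per line, "<unix-timestamp> <exit-code>" (assumed format).
# args:  secs deathcount lo hi now
#        lo and hi stand in for an events list made of one exit code interval.
# Returns 125 when the pattern matches, 0 otherwise.
permafail_check() {
  awk -v secs="$1" -v need="$2" -v lo="$3" -v hi="$4" -v now="$5" '
    # Count deaths that are recent enough and whose exit code is in [lo, hi].
    $1 >= now - secs && $2 >= lo && $2 <= hi { n++ }
    END { exit ((n >= need) ? 125 : 0) }
  '
}

# Three deaths with exit code 1 at t=100, 130 and 150, evaluated at t=155.
if printf '%s\n' '100 1' '130 1' '150 1' | permafail_check 60 3 1 1 155
then
  echo "no match: would exec into prog"
else
  echo "match: would exit 125"
fi
```

<p>
 In the sample run all three deaths fall within the last 60 seconds, so the
sketch reports a match. Evaluating the same tally at a later time, when only
one death is still inside the window, would not.
</p>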

<p>
 <em>events</em> is a comma-separated list of events. An event can be
one of the following:
</p>

<ul>
 <li> An exit code, which is an integer between 0 and 255. Example: <tt>1</tt> </li>
 <li> An exit code interval, which is two exit codes separated by a dash. Example: <tt>1-50</tt> </li>
 <li> A signal name, or a signal number preceded by "SIG". Examples: <tt>SIGTERM</tt>, <tt>sigabrt</tt>, <tt>sig11</tt> </li>
</ul>
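
<p>
 To make the list syntax concrete, here is a minimal shell sketch that tests
whether an exit code is covered by an <em>events</em> list. The helper name
<tt>exitcode_in_events</tt> is hypothetical and not part of s6, and the sketch
deliberately skips signal entries: it only handles single exit codes and exit
code intervals.
</p>

```shell
#!/bin/sh
# Hypothetical helper (not part of s6): succeeds if exit code $1 is covered
# by the comma-separated events list $2. Signal entries are ignored here.
# The body runs in a subshell so the IFS and globbing changes stay local.
exitcode_in_events() (
  code=$1
  IFS=,
  set -f
  for ev in $2; do
    case $ev in
      ''|*[!0-9-]*)
        # Empty, or contains something other than digits and dashes
        # (e.g. SIGTERM, sig11): not an exit code entry, skip it.
        continue
        ;;
      *-*)
        # Exit code interval "lo-hi", both bounds included.
        lo=${ev%%-*}
        hi=${ev#*-}
        [ "$code" -ge "$lo" ] && [ "$code" -le "$hi" ] && exit 0
        ;;
      *)
        # Single exit code.
        [ "$code" -eq "$ev" ] && exit 0
        ;;
    esac
  done
  exit 1
)

exitcode_in_events 102 1,101-103,SIGSEGV,SIGBUS && echo "102 is listed"
exitcode_in_events 2 1,101-103,SIGSEGV,SIGBUS || echo "2 is not listed"
```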

<h2> Usage </h2>

<ul>
  <li> <a href="s6-supervise.html">s6-supervise</a> detects when the <tt>./finish</tt>
script of its service exits 125, and stops respawning the service. So, if the
<tt>./finish</tt> script is a chain-loading command line starting with a
<tt>s6-permafailon</tt> invocation (or containing such an invocation), when
<tt>s6-permafailon</tt> exits 125, then the <tt>./finish</tt> script also
exits 125 (because it is the same process), and the service is then marked as
failing permanently. </li>
 <li> The <tt>./finish</tt> script is <em>naturally</em> a chain-loading
command line if it is written in the
<a href="//skarnet.org/software/execline/">execline</a> language. It
can also be made into a chain-loading command line from a shell script by using
<tt>exec s6-permafailon secs deathcount events rest-of-chainloading-cmdline...</tt> </li>
 <li> Multiple invocations of <tt>s6-permafailon</tt> can be chained, in order
to test several death patterns. </li>
 <li> If a permanent failure is triggered and <em>secs</em> is high, it is
possible that when the administrator manually launches the service again,
the next death triggers a permanent failure again. If this is not wanted,
the administrator should clear the death tally with the
<a href="s6-svdt-clear.html">s6-svdt-clear</a> command. </li>
 <li> The current death tally can be viewed via the <a href="s6-svdt.html">s6-svdt</a>
command. </li>
</ul>
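
<p>
 As an illustration of the shell variant and of chaining, the following
sketch writes a <tt>./finish</tt> script into a scratch directory. The
directory name <tt>example-service</tt>, the thresholds of the second
pattern and the final <tt>true</tt> command (a stand-in for whatever the
script should really do when no pattern matches) are assumptions chosen for
the example.
</p>

```shell
#!/bin/sh
# Create a scratch service directory (hypothetical name) holding only ./finish.
mkdir -p ./example-service

cat > ./example-service/finish <<'EOF'
#!/bin/sh
# Pattern 1: at least 5 deaths in the last 60 seconds with exit code 1,
#            101 to 103, SIGSEGV or SIGBUS.
# Pattern 2: at least 20 deaths in the last hour with any nonzero exit code.
# Each invocation either exits 125 or execs into the rest of the line, so the
# whole chain stays a single process.
exec s6-permafailon 60 5 1,101-103,SIGSEGV,SIGBUS \
     s6-permafailon 3600 20 1-255 \
     true
EOF

chmod +x ./example-service/finish
```

<p>
 Because the script uses <tt>exec</tt>, an exit code of 125 from either
invocation becomes the exit code of <tt>./finish</tt> itself, which is what
the supervisor looks at.
</p>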

<h2> Example </h2>

<p>
 <tt>s6-permafailon 60 5 1,101-103,SIGSEGV,SIGBUS <em>prog...</em></tt>
will exit 125 if the service has died at least 5 times in the last 60 seconds
with an exit code of 1, 101, 102 or 103, or from a SIGSEGV or a SIGBUS.
Otherwise it will chainload into the <em>prog...</em> command line.
</p>

</body>
</html>