skalibs

Mirror/fork of https://skarnet.org/software/skalibs/

selfpipe.html (8830B)


      1 <html>
      2   <head>
      3     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
      4     <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
      5     <meta http-equiv="Content-Language" content="en" />
      6     <title>skalibs: the selfpipe library interface</title>
      7     <meta name="Description" content="skalibs: the selfpipe library interface" />
      8     <meta name="Keywords" content="skalibs stddjb libstddjb selfpipe self-pipe library interface" />
      9     <!-- <link rel="stylesheet" type="text/css" href="//skarnet.org/default.css" /> -->
     10   </head>
     11 <body>
     12 
     13 <p>
     14 <a href="index.html">libstddjb</a><br />
     15 <a href="../libskarnet.html">libskarnet</a><br />
     16 <a href="../index.html">skalibs</a><br />
     17 <a href="//skarnet.org/software/">Software</a><br />
     18 <a href="//skarnet.org/">skarnet.org</a>
     19 </p>
     20 
     21 <h1> The <tt>selfpipe</tt> library interface </h1>
     22 
     23 <p>
     24  The selfpipe functions are declared in the
     25 <tt>skalibs/selfpipe.h</tt> header and implemented in the <tt>libskarnet.a</tt>
     26 or <tt>libskarnet.so</tt> library.
     27 </p>
     28 
     29 <h2> What does it do&nbsp;? </h2>
     30 
     31 <p>
     32 Signal handlers suck.
     33 </p>
     34 
     35 <p>
     36 They do. I don't care how experienced you are with C/Unix programming,
     37 they do. You can be Ken Thompson: if you use signal handlers as a
     38 regular part of your C programming model, you <em>are</em> going to
     39 screw up and write buggy code.
     40 </p>
     41 
     42 <p>
     43  Unix is tricky enough with interruptions. Even when you have a single
     44 thread, signals can make the execution flow very non-intuitive.
     45 They mess up the logic of linear and structured code,
     46 they introduce non-determinism; you always have to think "and what
     47 if I get interrupted here and the flow goes into a handler...". This
     48 is annoying.
     49 </p>
     50 
     51 <p>
     52  Moreover, signal handler code is <em>very</em> limited in what it can
     53 do. It can't call any function that is not async-signal-safe! If you call
     54 a non-reentrant function, and by chance you were precisely in that function's
     55 code when you got interrupted by a signal... you lose. That means, no
     56 malloc(). No buffered IO. No globals. The list goes on and on. <br />
     57  If you're going to catch signals, you'll want to handle them <em>outside</em>
     58 the signal handler. You actually want to spend <em>as little time as
     59 possible</em> inside a signal handler - just enough to notify your main
     60 execution flow that there's a signal to take care of.
     61 </p>
     62 
     63 <p>
     64  And, of course, signal handlers don't mix with event loops, which is
     65 a classic source of headaches for programmers and led to the birth of
     66 abominations such as
     67 <a href="https://www.opengroup.org/onlinepubs/009695399/functions/pselect.html">
     68 pselect</a>. So much for the "everything is a file" concept that Unix was
     69 built on.
     70 </p>
     71 
     72 <p>
     73  A signal should be an event like any other.
     74 There should be a unified interface - receiving a signal should make some
     75 fd readable or something.
     76 </p>
     77 
     78 <p>
     79  And that's exactly what the
     80 <a href="https://cr.yp.to/docs/selfpipe.html">self-pipe trick</a>, invented
     81 by <a href="../djblegacy.html">DJB</a>, does.
     82 </p>
     83 
     84 <p>
     85  As long as you're in some kind of event loop, the self-pipe trick allows
     86 you to forget about signal handlers... <em>forever</em>. It works this way:
     87 </p>
     88 
     89 <ol>
     90  <li> Create a pipe <tt>p</tt>. Make both ends close-on-exec and nonblocking. </li>
     91  <li> Write a tiny signal handler ("top half") for all the signals you want to
     92 catch. This
     93 signal handler should just write one byte into <tt>p[1]</tt>, and do nothing
     94 more; ideally, the written byte identifies the signal. </li>
     95  <li> In your event loop, add <tt>p[0]</tt> to the list of fds you're watching
     96 for readability. </li>
     97 </ol>
     98 
     99 <p>
    100  When you get a signal, a byte will be written to the self-pipe, and your
    101 execution flow will resume. When you next go through the event loop,
    102 <tt>p[0]</tt> will be readable; you'll then be able to read a byte from
    103 it, identify the signal, and handle it - in your unrestricted main
    104 environment (the "bottom half" of the handler).
    105 </p>
    106 
    107 <p>
    108  The selfpipe library does it all for you - you don't even have to write
    109 the top half yourself. You can forget that signal handlers exist and
    110 recover some peace of mind.
    111 </p>
    112 
    113 <p>
    114  Note that in an asynchronous event loop, you need to protect your
    115 system calls against EINTR by using <a href="safewrappers.html">safe
    116 wrappers</a>.
    117 </p>
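
<p>
 To show what protecting a system call against EINTR means, here is a
minimal retry loop around <tt>read()</tt>. The name <tt>read_retry</tt> is
hypothetical and this is not one of the skalibs safe wrappers; it only
illustrates the general idea of restarting a call that a signal interrupted.
</p>

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper, not part of skalibs: call read() again for as long
   as it fails because a signal handler interrupted it. */
ssize_t read_retry (int fd, void *buf, size_t len)
{
  ssize_t r ;
  do r = read(fd, buf, len) ;
  while (r < 0 && errno == EINTR) ;
  return r ;
}
```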
    118 
    119 <h2> How do I use it&nbsp;? </h2>
    120 
    121 <h3> Starting </h3>
    122 
    123 <pre>
    124 int fd = selfpipe_init() ;
    125 </pre>
    126 
    127 <p>
    128 <tt>selfpipe_init()</tt> sets up a selfpipe. You must use that
    129 function first. <br />
    130 If <tt>fd</tt> is -1, then an error occurred. Else <tt>fd</tt> is a
    131 non-blocking descriptor that can be used in your event loop. It will
    132 be selected for readability when you've caught a signal.
    133 </p>
    134 
    135 <h3> Trapping signals </h3>
    136 
    137 <pre>
    138 int r = selfpipe_trap(SIGTERM) ;
    139 </pre>
    140 
    141 <p>
    142 <tt>selfpipe_trap()</tt> catches a signal and sends it to the selfpipe.
    143 Uncaught signals won't trigger the selfpipe. <tt>r</tt> is 1 if
    144 the operation succeeded, and 0 if it failed. If it succeeded, you
    145 can forget about the trapped signal entirely. <br />
    146 In our example, if <tt>r</tt> is 1, then a SIGTERM will instantly
    147 trigger readability on <tt>fd</tt>.
    148 </p>
    149 
    150 <pre>
    151 int r ;
    152 sigset_t set ;
    153 sigemptyset(&amp;set) ;
    154 sigaddset(&amp;set, SIGTERM) ;
    155 sigaddset(&amp;set, SIGHUP) ;
    156 r = selfpipe_trapset(&amp;set) ;
    157 </pre>
    158 
    159 <p>
    160 <tt>selfpipe_trap()</tt> handles signals one
    161 by one. Alternatively (and often preferably), you can use
    162 <tt>selfpipe_trapset()</tt> to directly handle signal sets. When you call
    163 <tt>selfpipe_trapset()</tt>, signals that are present in <tt>set</tt> will
    164 be caught by the selfpipe, and signals that are absent from <tt>set</tt>
    165 will be uncaught. <tt>r</tt> is 1 if the operation succeeded and 0 if it
    166 failed.
    167 </p>
    168 
    169 <h3> Handling events </h3>
    170 
    171 <pre>
    172 int c = selfpipe_read() ;
    173 </pre>
    174 
    175 <p>
    176  Call <tt>selfpipe_read()</tt> when your <tt>fd</tt> is readable.
    177 That's where you write your <em>real</em> signal handler: in the
    178 body of your event loop, in a "normal" context. <br />
    179 <tt>c</tt> is -1 if an error occurred - in which case chances are
    180 it's a serious one and your system has become very unstable.
    181 <tt>c</tt> is 0 if there are no more pending signals. If <tt>c</tt>
    182 is positive, it is the number of the signal that was caught.
    183 </p>
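
<p>
 Since several signals may be pending when the fd becomes readable, the
usual pattern is to keep reading until the "no more pending signals" value
comes back. The sketch below reproduces that return convention on a plain
nonblocking pipe; <tt>sketch_read</tt>, <tt>drain</tt>, <tt>on_signal</tt>
and <tt>make_nonblocking</tt> are hypothetical names, and the code is an
illustration, not what <tt>selfpipe_read()</tt> does internally.
</p>

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical names; a sketch of the convention above, not skalibs code. */

/* The read end must be nonblocking, or the loop below would hang
   once the pipe is drained. Returns 0 on success, -1 on error. */
int make_nonblocking (int fd)
{
  int fl = fcntl(fd, F_GETFL) ;
  return fl < 0 ? -1 : fcntl(fd, F_SETFL, fl | O_NONBLOCK) ;
}

/* -1: error; 0: no more pending signals; positive: a signal number. */
int sketch_read (int fd)
{
  unsigned char c ;
  ssize_t r ;
  do r = read(fd, &c, 1) ;
  while (r < 0 && errno == EINTR) ;
  if (r == 1) return c ;
  if (r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) return 0 ;
  return -1 ;                      /* real error, or EOF on the pipe */
}

static unsigned int handled = 0 ;  /* demo state only */

/* Bottom half: runs in a normal context, so anything is allowed here. */
void on_signal (int sig)
{
  (void)sig ;
  handled++ ;
}

/* Call when fd is readable: handle every pending signal.
   Returns 0 once the pipe is empty, -1 on error. */
int drain (int fd)
{
  int sig ;
  while ((sig = sketch_read(fd)) > 0) on_signal(sig) ;
  return sig ;
}
```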
    184 
    185 <h3> Accessing the selfpipe </h3>
    186 
    187 <pre>
    188 int fd = selfpipe_fd() ;
    189 </pre>
    190 
    191 <p>
    192  Sometimes you need to access the fd of the selfpipe in two
    193 very distinct translation units (typically to poll on it), and you
    194 rightly don't want to add a global variable to store it, especially
    195 since it's already stored in a global internal variable in skalibs.
    196 No need to bloat your binary anymore: <tt>selfpipe_fd()</tt> will
    197 now retrieve the value for you, wherever you are.
    198 </p>
    199 
    200 <h3> Finishing </h3>
    201 
    202 <pre>
    203 selfpipe_finish() ;
    204 </pre>
    205 
    206 <p>
    207  Call <tt>selfpipe_finish()</tt> when you're done using the selfpipe.
    208 Signal handlers will be restored to SIG_DFL, i.e. signals will not
    209 be trapped anymore.
    210 </p>
    211 
    212 <h2> Any limitations&nbsp;? </h2>
    213 
    214 <p>
    215  Some, as always.
    216 </p>
    217 
    218 <ul>
    219  <li> The selfpipe library uses a global pipe;
    220 so, it's theoretically not safe for multithreading. However, as long as you dedicate
    221 one thread to signal handling and block signals in all the other threads
    222 (see <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_sigmask.html">pthread_sigmask()</a>)
    223 then you should be able to use the selfpipe in the thread that handles
    224 signals without trouble. Since reading the selfpipe involves waiting for
    225 a file descriptor to become readable, it is recommended to do this in a
    226 thread that will already have a regular input/output loop (via
    227 <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html">poll()</a>
    228 or <a href="iopause.html">iopause()</a>) so you can just add the selfpipe
    229 to the list of fds you're reading on. </li>
    230  <li> In rare cases, the self-pipe can theoretically be filled, if some
    231 application sends more than PIPE_BUF signals before you have time to
    232 <tt>selfpipe_read()</tt>. On most Unix systems, PIPE_BUF is 4096,
    233 so it's a very acceptable margin. Unless your code is waiting where
    234 it should not be, only malicious applications will fill the self-pipe
    235 - and malicious applications could just send you a SIGKILL and be done
    236 with you, so this is not a concern. Protect yourself from malicious
    237 applications with clever use of uids. </li>
    238 </ul>
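
<p>
 To get a feel for the second limitation, the following sketch counts how
many one-byte writes a fresh nonblocking pipe accepts before it reports
that it is full. The function name is hypothetical and the result depends
on the system; the point is only that, past that count, a top half writing
to a nonblocking pipe gets an error back instead of blocking, so the
notification is lost.
</p>

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write single bytes into a new nonblocking pipe that
   nobody reads, and count them until the pipe is full.
   Returns the count, or -1 on unexpected error. */
long count_bytes_until_full (void)
{
  int p[2] ;
  long n = 0 ;
  int fl ;
  char c = 0 ;
  if (pipe(p) < 0) return -1 ;
  fl = fcntl(p[1], F_GETFL) ;
  if (fl < 0 || fcntl(p[1], F_SETFL, fl | O_NONBLOCK) < 0) n = -1 ;
  while (n >= 0)
  {
    ssize_t r = write(p[1], &c, 1) ;
    if (r == 1) { n++ ; continue ; }
    if (r < 0 && errno == EINTR) continue ;
    /* EAGAIN means "full": stop counting. Anything else is an error. */
    if (!(r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))) n = -1 ;
    break ;
  }
  close(p[0]) ;
  close(p[1]) ;
  return n ;
}
```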
    239 
    240 <h2> Hey, Linux has <a href="https://man7.org/linux/man-pages/man2/signalfd.2.html">signalfd()</a> for this&nbsp;! </h2>
    241 
    242 <p>
    243  Yes, the Linux team loves to gratuitously add new system calls to do
    244 things that could already be done before without much effort. This
    245 adds API complexity, which is not a sign of good engineering.
    246 </p>
    247 
    248 <p>
    249  However, now that <tt>signalfd()</tt> exists, it is indeed marginally more
    250 efficient than a pipe, and it saves one fd: so the selfpipe library
    251 is implemented via <tt>signalfd()</tt> when this call is available.
    252 </p>
    253 
    254 </body>
    255 </html>