<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Language" content="en" />
<title>skalibs: the selfpipe library interface</title>
<meta name="Description" content="skalibs: the selfpipe library interface" />
<meta name="Keywords" content="skalibs stddjb libstddjb selfpipe self-pipe library interface" />
<!-- <link rel="stylesheet" type="text/css" href="//skarnet.org/default.css" /> -->
</head>
<body>

<p>
<a href="index.html">libstddjb</a><br />
<a href="../libskarnet.html">libskarnet</a><br />
<a href="../index.html">skalibs</a><br />
<a href="//skarnet.org/software/">Software</a><br />
<a href="//skarnet.org/">skarnet.org</a>
</p>

<h1> The <tt>selfpipe</tt> library interface </h1>

<p>
 The selfpipe functions are declared in the
<tt>skalibs/selfpipe.h</tt> header and implemented in the <tt>libskarnet.a</tt>
or <tt>libskarnet.so</tt> library.
</p>

<h2> What does it do ? </h2>

<p>
 Signal handlers suck.
</p>

<p>
 They do. I don't care how experienced you are with C/Unix programming,
they do. You can be Ken Thompson: if you use signal handlers as a
regular part of your C programming model, you <em>are</em> going to
screw up and write buggy code.
</p>

<p>
 Unix is tricky enough with interruptions. Even when you have a single
thread, signals can make the execution flow very non-intuitive.
They mess up the logic of linear and structured code, and
they introduce non-determinism; you always have to think "and what
if I get interrupted here and the flow goes into a handler...". This
is annoying.
</p>

<p>
 Moreover, signal handler code is <em>very</em> limited in what it can
do. It can't use any non-reentrant function!
If you call a non-reentrant
function, and by chance you were precisely in that non-reentrant function
code when you got interrupted by a signal... you lose. That means no
malloc(). No buffered IO. No globals. The list goes on and on. <br />
 If you're going to catch signals, you'll want to handle them <em>outside</em>
the signal handler. You actually want to spend <em>the least possible
time</em> inside a signal handler - just enough to notify your main
execution flow that there's a signal to take care of.
</p>

<p>
 And, of course, signal handlers don't mix with event loops, which is
a classic source of headaches for programmers and has led to the birth of
abominations such as
<a href="https://www.opengroup.org/onlinepubs/009695399/functions/pselect.html">
pselect</a>. So much for the "everything is a file" concept that Unix was
built on.
</p>

<p>
 A signal should be an event like any other.
There should be a unified interface - receiving a signal should make some
fd readable or something.
</p>

<p>
 And that's exactly what the
<a href="https://cr.yp.to/docs/selfpipe.html">self-pipe trick</a>, invented
by <a href="../djblegacy.html">DJB</a>, does.
</p>

<p>
 As long as you're in some kind of event loop, the self-pipe trick allows
you to forget about signal handlers... <em>forever</em>. It works this way:
</p>

<ol>
 <li> Create a pipe <tt>p</tt>. Make both ends close-on-exec and nonblocking. </li>
 <li> Write a tiny signal handler ("top half") for all the signals you want to
catch. This
signal handler should just write one byte into <tt>p[1]</tt>, and do nothing
more; ideally, the written byte identifies the signal. </li>
 <li> In your event loop, add <tt>p[0]</tt> to the list of fds you're watching
for readability. </li>
</ol>

<p>
 When you get a signal, a byte will be written to the self-pipe, and your
execution flow will resume.
When you next go through the event loop,
<tt>p[0]</tt> will be readable; you'll then be able to read a byte from
it, identify the signal, and handle it - in your unrestricted main
environment (the "bottom half" of the handler).
</p>

<p>
 The selfpipe library does it all for you - you don't even have to write
the top half yourself. You can forget that signal handlers exist and recover
some peace of mind.
</p>

<p>
 Note that in an asynchronous event loop, you need to protect your
system calls against EINTR by using <a href="safewrappers.html">safe
wrappers</a>.
</p>

<h2> How do I use it ? </h2>

<h3> Starting </h3>

<pre>
int fd = selfpipe_init() ;
</pre>

<p>
<tt>selfpipe_init()</tt> sets up a selfpipe. You must call that
function first. <br />
 If <tt>fd</tt> is -1, then an error occurred. Else <tt>fd</tt> is a
non-blocking descriptor that can be used in your event loop. It will
be selected for readability when you've caught a signal.
</p>

<h3> Trapping signals </h3>

<pre>
int r = selfpipe_trap(SIGTERM) ;
</pre>

<p>
<tt>selfpipe_trap()</tt> catches a signal and sends it to the selfpipe.
Uncaught signals won't trigger the selfpipe. <tt>r</tt> is 1 if
the operation succeeded, and 0 if it failed. If it succeeded, you
can forget about the trapped signal entirely. <br />
 In our example, if <tt>r</tt> is 1, then a SIGTERM will instantly
trigger readability on <tt>fd</tt>.
</p>

<pre>
int r ;
sigset_t set ;
sigemptyset(&amp;set) ;
sigaddset(&amp;set, SIGTERM) ;
sigaddset(&amp;set, SIGHUP) ;
r = selfpipe_trapset(&amp;set) ;
</pre>

<p>
<tt>selfpipe_trap()</tt> handles signals one
by one. Alternatively (and often preferably), you can use
<tt>selfpipe_trapset()</tt> to directly handle signal sets.
When you call
<tt>selfpipe_trapset()</tt>, signals that are present in <tt>set</tt> will
be caught by the selfpipe, and signals that are absent from <tt>set</tt>
will no longer be caught. <tt>r</tt> is 1 if the operation succeeded and 0 if it
failed.
</p>

<h3> Handling events </h3>

<pre>
int c = selfpipe_read() ;
</pre>

<p>
 Call <tt>selfpipe_read()</tt> when your <tt>fd</tt> is readable.
That's where you write your <em>real</em> signal handler: in the
body of your event loop, in a "normal" context. <br />
<tt>c</tt> is -1 if an error occurred - in which case chances are
it's a serious one and your system has become very unstable.
<tt>c</tt> is 0 if there are no more pending signals. If <tt>c</tt>
is positive, it is the number of the signal that was caught.
</p>

<h3> Accessing the selfpipe </h3>

<pre>
int fd = selfpipe_fd() ;
</pre>

<p>
 Sometimes you need to access the fd of the selfpipe in two
very distinct translation units (typically to poll on it), and you
rightly don't want to add a global variable to store it, especially
since it's already stored in a global internal variable in skalibs.
No need to bloat your binary anymore: <tt>selfpipe_fd()</tt> will
retrieve the value for you, wherever you are.
</p>

<h3> Finishing </h3>

<pre>
selfpipe_finish() ;
</pre>

<p>
 Call <tt>selfpipe_finish()</tt> when you're done using the selfpipe.
Signal handlers will be restored to SIG_DFL, i.e. signals will not
be trapped anymore.
</p>

<h2> Any limitations ? </h2>

<p>
 Some, as always.
</p>

<ul>
 <li> The selfpipe library uses a global pipe;
so, it's theoretically not safe for multithreading.
However, as long as you dedicate
one thread to signal handling and block signals in all the other threads
(see <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_sigmask.html">pthread_sigmask()</a>),
you should be able to use the selfpipe in the thread that handles
signals without trouble. Since reading the selfpipe involves waiting for
a file descriptor to become readable, it is recommended to do this in a
thread that already has a regular input/output loop (via
<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html">poll()</a>
or <a href="iopause.html">iopause()</a>) so you can just add the selfpipe
to the list of fds you're reading on. </li>
 <li> In rare cases, the self-pipe can theoretically be filled, if some
application sends more than PIPE_BUF signals before you have time to
<tt>selfpipe_read()</tt>. On most Unix systems, PIPE_BUF is 4096,
so it's a very acceptable margin. Unless your code is waiting where
it should not be, only malicious applications will fill the self-pipe
- and malicious applications could just send you a SIGKILL and be done
with you, so this is not a concern. Protect yourself from malicious
applications with clever use of uids. </li>
</ul>

<h2> Hey, Linux has <a href="https://man7.org/linux/man-pages/man2/signalfd.2.html">signalfd()</a> for this ! </h2>

<p>
 Yes, the Linux team loves to gratuitously add new system calls to do
things that could already be done before without much effort. This
adds API complexity, which is not a sign of good engineering.
</p>

<p>
 However, now that <tt>signalfd()</tt> exists, it is indeed marginally more
efficient than a pipe, and it saves one fd: so the selfpipe library
is implemented via <tt>signalfd()</tt> when this call is available.
</p>

</body>
</html>