Boost C++ Libraries: Ticket #11777: Doing async reads appears to do spurious calls to recvmsg that return EAGAIN
https://svn.boost.org/trac10/ticket/11777
<p> Running strace on a program that runs many threads, each controlling many streams, we see the following pattern when looking at one of the threads: </p>
<p>
epoll_wait(31, ..., 128, -1) = 46
recvmsg(4114, ..., 0) = 2892
recvmsg(4114, ..., 0) = -1 EAGAIN
recvmsg(3700, ..., 0) = 16768
recvmsg(3700, ..., 0) = -1 EAGAIN
and so on for all 46 sockets.
</p>
<p> Running strace -c on the process, where the bulk (90+%) of the threads are related to these sockets, shows 60% of the time in 27624 calls to write (we read data, process it, and write it to another socket), 34% in 52153 calls to recvmsg (of which 24526 return errors), and 6% in 4668 calls to epoll_wait. </p>
<p> Monitoring with the times() system call shows that the bulk of the application's time is system time. System calls that return immediately with an error are a good way to cause this, especially if buffers are mapped or tested for validity before the EAGAIN check is done. </p>
<p> The release notes for subsequent versions do not indicate a fix for this issue. </p>
<p> Unfortunately, the programs are proprietary, and the data streams are proprietary streams on our customers' networks. </p>
<p> If epoll is used with edge-triggered events, a read that returns fewer bytes than requested has already drained the socket, and any data that arrives afterwards produces a new notification, so the extra read is not needed. </p>
<p> Even though this is an optimization, it is causing us a major performance issue on a large number of machines, hence the Showstopper severity. </p>
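<p> To illustrate the optimization suggested above, here is a minimal sketch of a read loop for a non-blocking stream socket registered with edge-triggered epoll. It is not Boost.Asio's implementation; the names DrainResult and drain_on_edge are hypothetical and exist only for this example. The sketch assumes Linux and a stream socket, and stops after a short read instead of issuing a second call that would return EAGAIN. </p>

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical result type for this sketch: what one readiness event produced.
struct DrainResult {
    std::size_t bytes = 0;     // total bytes received during this event
    int recv_calls = 0;        // number of recv() system calls issued
    bool peer_closed = false;  // recv() returned 0 (orderly shutdown)
    int error = 0;             // errno of a hard failure, 0 otherwise
};

// Hypothetical helper: read from a non-blocking stream socket after an
// edge-triggered readiness notification. A short read (fewer bytes than
// requested) means the receive buffer is empty, so the loop stops there
// rather than calling recv() again just to see EAGAIN.
inline DrainResult drain_on_edge(int fd, char* buf, std::size_t cap) {
    assert(cap > 0);
    DrainResult r;
    for (;;) {
        ssize_t n = ::recv(fd, buf, cap, 0);
        ++r.recv_calls;
        if (n > 0) {
            r.bytes += static_cast<std::size_t>(n);
            // A real caller would consume buf[0..n) here before looping,
            // because the next recv() overwrites the buffer.
            if (static_cast<std::size_t>(n) < cap) {
                break;  // short read: socket drained, wait for the next edge
            }
            continue;   // buffer was filled completely: more data may remain
        }
        if (n == 0) {
            r.peer_closed = true;
            break;
        }
        if (errno == EINTR) {
            continue;   // interrupted by a signal: retry
        }
        if (errno != EAGAIN && errno != EWOULDBLOCK) {
            r.error = errno;  // hard error; EAGAIN just means "drained"
        }
        break;
    }
    return r;
}
```

<p> With a 4096-byte buffer and 100 bytes pending, this issues a single recv() call. A second call is made only when the first one fills the buffer completely, which is the one case where more data may still be queued. This shortcut applies to stream-oriented descriptors only; it should not be assumed to hold for datagram sockets. </p>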