Boost C++ Libraries: Ticket #11968: Undefined behavior in `boost::lockfree::queue`: nodes are always misaligned https://svn.boost.org/trac10/ticket/11968 <p> Here is a minimal reproducer to illustrate the problem: </p> <pre class="wiki">#include &lt;boost/lockfree/queue.hpp&gt; struct entry { int x = -1; }; int main() { boost::lockfree::queue&lt;entry, boost::lockfree::capacity&lt;1&gt;&gt; queue; queue.push({0}); entry d; queue.pop(d); } </pre><pre class="wiki">zaytsev@work:~/src/gcc$ g++ -std=c++14 -fsanitize=undefined lockfree.cpp zaytsev@work:~/src/gcc$ ./a.out /usr/include/boost/lockfree/detail/freelist.hpp:442:9: runtime error: constructor call on misaligned address 0x7fffd0831720 for type 'struct node', which requires 64 byte alignment 0x7fffd0831720: note: pointer points here 00 00 00 00 00 00 a6 5b 2a 7f 00 00 ff ff 00 00 01 00 00 00 40 17 83 d0 ff 7f 00 00 f8 0a 40 00 ^ /usr/include/boost/lockfree/detail/freelist.hpp:454:9: runtime error: constructor call on misaligned address 0x7fffd08316e0 for type 'struct node', which requires 64 byte alignment 0x7fffd08316e0: note: pointer points here ff 7f 00 00 02 00 60 00 00 00 00 00 70 08 40 00 00 00 00 00 88 50 60 00 00 00 00 00 d9 64 a9 5b ^ /usr/include/boost/lockfree/queue.hpp:102:19: runtime error: member access within misaligned address 0x7fffd08316e0 for type 'struct node', which requires 64 byte alignment 0x7fffd08316e0: note: pointer points here ff 7f 00 00 02 00 60 00 00 00 00 00 70 08 40 00 00 00 00 00 88 50 60 00 00 00 00 00 d9 64 a9 5b ^ </pre><p> This happens because nodes are declared to be cache line aligned in <code>queue.hpp</code>: </p> <pre class="wiki">struct BOOST_LOCKFREE_CACHELINE_ALIGNMENT node { ... }; </pre><p> However, alignment to 64 bytes makes node an over-aligned type, and for such types the behavior is implementation defined. In particular, GCC 5.3, which was used for this test, doesn't seem to take alignment requirements of over-aligned types into account when allocating on the heap. </p> <p> Apparently, nobody has noticed this so far, because on x86 this will simply cause a slowdown, but on other targets, such as ARM, the program would be terminated with a <code>SIGBUS</code> if a misaligned read occurs. It could be that the problem reported here: <a class="ext-link" href="https://github.com/boostorg/lockfree/issues/11"><span class="icon">​</span>https://github.com/boostorg/lockfree/issues/11</a> is a manifestation of this larger issue. </p> <p> Now, I would be happy to suggest a fix, but the problem is rather tricky. </p> <ol><li>If the storage for nodes is allocated on the heap, then probably the fix is as simple as replacing the default <code>std::allocator</code> with <code>boost::aligned_allocator</code>. </li></ol><ol start="2"><li>However, what makes <code>boost::lockfree::queue</code> particularly interesting is the support for compile-time sized storage. In this case, the situation is not so simple. One can get aligned storage with something like <code>boost::aligned_storage&lt;size * sizeof(T), BOOST_LOCKFREE_CACHELINE_BYTES&gt; data</code>, but is only part of the solution.<br /><br />Indeed, <code>compiletime_sized_freelist_storage</code> should not only align <code>data</code>, but also over-allocate it, and the node allocator from the <code>fixed_size_freelist</code> should place nodes in <code>data</code> at appropriate positions so that not only the first, but also subsequent nodes remain aligned. </li></ol><ol start="3"><li>In addition to that, <code>BOOST_LOCKFREE_CACHELINE_BYTES</code> should be exposed publicly as a constant and those who are passing custom allocators to the <code>boost::lockfree::queue</code> should use that together with <code>boost::aligned_allocator_adaptor</code> to make sure that their allocators honor the desired alignment. </li></ol><p> So, before investing any time in a patch, the problem has first to be discussed in more details... Maybe it also makes sense to change Boost.Lockfree to use Boost.Align, which was probably not used before because the former predated the latter. </p> <p> For now, the easiest stop gap solution that I can think of would be to patch out the alignment requirement, as it simply doesn't work as expected, but potentially causes problems for other architectures as x86. This would be something I would suggest doing ASAP for Boost 1.61. </p> en-us Boost C++ Libraries /htdocs/site/boost.png https://svn.boost.org/trac10/ticket/11968 Trac 1.4.3 timblechmann Tue, 09 Feb 2016 17:58:17 GMT <link>https://svn.boost.org/trac10/ticket/11968#comment:1 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/11968#comment:1</guid> <description> <p> interesting analysis. i'll need to have a closer look into it! </p> </description> <category>Ticket</category> </item> <item> <author>yury.zaytsev@…</author> <pubDate>Tue, 09 Aug 2016 09:43:02 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/11968#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/11968#comment:2</guid> <description> <p> Hi Tim, </p> <p> This is to let you know that the problem is actually much more serious than I've originally thought: as it turns out, the unfortunate interaction of this bug with atomic libcall emission strategy in clang++-3.8 causes the latter to generate libcalls to <code>libatomic</code> for accesses to the lockfree queues, which is annoying (you have to manually link against <code>libatomic</code>) and might possibly lead to crashes in combination with other voodoo. </p> <p> I was able to reproduce libcall emission in the minimal example above, but not the crashes, however, I've seen the crashes in our production code. They might be caused by the interplay of the general insanity of our code with this particular problem, but if it crashes for us, then it can possibly crash for others. </p> <p> Anyways, my understanding is that as clang sees that your nodes can't possibly be correctly aligned as requested, it errs at the side of caution and generates libcalls to <code>libatomic</code> functions (i.e. <code>__atomic_store_4</code>, etc.) instead sticking to the atomic ops on x86, even though the latter would still work, only with a performance penalty, unlike on some other arches such as ARM. </p> <p> Contrarily to this clang-3.8 behavior, gcc 6.1 generates atomic ops in this case, probably because it knows that they would still work on x86, or maybe I'm overestimating its intelligence and it simply doesn't care, but anyhow our code works just fine. </p> <p> If you want to reproduce it, then compile the minimal example above with gcc &amp; clang against Boost 1.61 and search for (1) libcalls (<code>__atomic_...</code>) and (2) atomic ops (<code>lock ...</code>): </p> <pre class="wiki">{g++-6,clang++-3.8} -S -o lockfree.asm -std=c++14 -O3 -march=native lockfree.cpp </pre><p> Now, the funny thing is that I've discovered that accidentally the following commit fixes the issue for me: </p> <p> <a class="ext-link" href="https://github.com/boostorg/lockfree/commit/72548c26e5243c88660e8c24524616dff3f18164"><span class="icon">​</span>https://github.com/boostorg/lockfree/commit/72548c26e5243c88660e8c24524616dff3f18164</a> </p> <p> Unfortunately, it seems that it's only going to make it in Boost 1.62. However, I have discovered a workaround for Boost 1.61: if one forces the use of Boost.Atomic instead of <code>std::atomic</code>, then <code>clang++-3.8</code> doesn't emit the libcalls anymore and sticks to the atomic ops! </p> <p> P.S. The moral of this story: check your code with anything you might get your hands on (tests, various compilers, static analyzers, instrumentation systems, etc.) and don't ignore the problems that they are reporting! </p> <p> Hope that helps, </p> <p> --Yury. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Tue, 11 Oct 2016 13:56:18 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/11968#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/11968#comment:3</guid> <description> <p> Two updates: </p> <ul><li>I'm still getting tons of UBSan unaligned access warnings also on the latest version of Lockfree from git with our production code, although the problem with the original reproducer seems to be gone... maybe worth it letting unit tests run through UBSan or so. </li></ul><ul><li>The segfaults that I've mentioned were unrelated to the Lockfree, just triggered by it, ended up being botched singletons in our code, which weren't single after all. The rest of the analysis is unaffected though. </li></ul> </description> <category>Ticket</category> </item> </channel> </rss>