Opened 7 years ago

Last modified 6 years ago

#11968 new Bugs

Undefined behavior in `boost::lockfree::queue`: nodes are always misaligned

Reported by: yury.zaytsev@… Owned by: timblechmann
Milestone: To Be Determined Component: lockfree
Version: Boost Development Trunk Severity: Showstopper
Keywords: Cc:

Description

Here is a minimal reproducer to illustrate the problem:

#include <boost/lockfree/queue.hpp>

struct entry {
    int x = -1;
};

int main() {

    boost::lockfree::queue<entry, boost::lockfree::capacity<1>> queue;

    queue.push({0});

    entry d;
    queue.pop(d);

}
zaytsev@work:~/src/gcc$ g++ -std=c++14 -fsanitize=undefined lockfree.cpp 
zaytsev@work:~/src/gcc$ ./a.out 
/usr/include/boost/lockfree/detail/freelist.hpp:442:9: runtime error: constructor call on misaligned address 0x7fffd0831720 for type 'struct node', which requires 64 byte alignment
0x7fffd0831720: note: pointer points here
 00 00 00 00  00 00 a6 5b 2a 7f 00 00  ff ff 00 00 01 00 00 00  40 17 83 d0 ff 7f 00 00  f8 0a 40 00
              ^ 
/usr/include/boost/lockfree/detail/freelist.hpp:454:9: runtime error: constructor call on misaligned address 0x7fffd08316e0 for type 'struct node', which requires 64 byte alignment
0x7fffd08316e0: note: pointer points here
 ff 7f 00 00  02 00 60 00 00 00 00 00  70 08 40 00 00 00 00 00  88 50 60 00 00 00 00 00  d9 64 a9 5b
              ^ 
/usr/include/boost/lockfree/queue.hpp:102:19: runtime error: member access within misaligned address 0x7fffd08316e0 for type 'struct node', which requires 64 byte alignment
0x7fffd08316e0: note: pointer points here
 ff 7f 00 00  02 00 60 00 00 00 00 00  70 08 40 00 00 00 00 00  88 50 60 00 00 00 00 00  d9 64 a9 5b
              ^ 

This happens because nodes are declared to be cache line aligned in queue.hpp:

struct BOOST_LOCKFREE_CACHELINE_ALIGNMENT node { ... };

However, alignment to 64 bytes makes node an over-aligned type, and for such types the behavior is implementation defined. In particular, GCC 5.3, which was used for this test, doesn't seem to take alignment requirements of over-aligned types into account when allocating on the heap.

Apparently, nobody has noticed this so far, because on x86 this will simply cause a slowdown, but on other targets, such as ARM, the program would be terminated with a SIGBUS if a misaligned read occurs. It could be that the problem reported here: https://github.com/boostorg/lockfree/issues/11 is a manifestation of this larger issue.

Now, I would be happy to suggest a fix, but the problem is rather tricky.

  1. If the storage for nodes is allocated on the heap, then probably the fix is as simple as replacing the default std::allocator with boost::aligned_allocator.
  1. However, what makes boost::lockfree::queue particularly interesting is the support for compile-time sized storage. In this case, the situation is not so simple. One can get aligned storage with something like boost::aligned_storage<size * sizeof(T), BOOST_LOCKFREE_CACHELINE_BYTES> data, but is only part of the solution.

    Indeed, compiletime_sized_freelist_storage should not only align data, but also over-allocate it, and the node allocator from the fixed_size_freelist should place nodes in data at appropriate positions so that not only the first, but also subsequent nodes remain aligned.
  1. In addition to that, BOOST_LOCKFREE_CACHELINE_BYTES should be exposed publicly as a constant and those who are passing custom allocators to the boost::lockfree::queue should use that together with boost::aligned_allocator_adaptor to make sure that their allocators honor the desired alignment.

So, before investing any time in a patch, the problem has first to be discussed in more details... Maybe it also makes sense to change Boost.Lockfree to use Boost.Align, which was probably not used before because the former predated the latter.

For now, the easiest stop gap solution that I can think of would be to patch out the alignment requirement, as it simply doesn't work as expected, but potentially causes problems for other architectures as x86. This would be something I would suggest doing ASAP for Boost 1.61.

Change History (3)

comment:1 by timblechmann, 7 years ago

interesting analysis. i'll need to have a closer look into it!

comment:2 by yury.zaytsev@…, 6 years ago

Hi Tim,

This is to let you know that the problem is actually much more serious than I've originally thought: as it turns out, the unfortunate interaction of this bug with atomic libcall emission strategy in clang++-3.8 causes the latter to generate libcalls to libatomic for accesses to the lockfree queues, which is annoying (you have to manually link against libatomic) and might possibly lead to crashes in combination with other voodoo.

I was able to reproduce libcall emission in the minimal example above, but not the crashes, however, I've seen the crashes in our production code. They might be caused by the interplay of the general insanity of our code with this particular problem, but if it crashes for us, then it can possibly crash for others.

Anyways, my understanding is that as clang sees that your nodes can't possibly be correctly aligned as requested, it errs at the side of caution and generates libcalls to libatomic functions (i.e. __atomic_store_4, etc.) instead sticking to the atomic ops on x86, even though the latter would still work, only with a performance penalty, unlike on some other arches such as ARM.

Contrarily to this clang-3.8 behavior, gcc 6.1 generates atomic ops in this case, probably because it knows that they would still work on x86, or maybe I'm overestimating its intelligence and it simply doesn't care, but anyhow our code works just fine.

If you want to reproduce it, then compile the minimal example above with gcc & clang against Boost 1.61 and search for (1) libcalls (__atomic_...) and (2) atomic ops (lock ...):

{g++-6,clang++-3.8} -S -o lockfree.asm -std=c++14 -O3 -march=native lockfree.cpp

Now, the funny thing is that I've discovered that accidentally the following commit fixes the issue for me:

https://github.com/boostorg/lockfree/commit/72548c26e5243c88660e8c24524616dff3f18164

Unfortunately, it seems that it's only going to make it in Boost 1.62. However, I have discovered a workaround for Boost 1.61: if one forces the use of Boost.Atomic instead of std::atomic, then clang++-3.8 doesn't emit the libcalls anymore and sticks to the atomic ops!

P.S. The moral of this story: check your code with anything you might get your hands on (tests, various compilers, static analyzers, instrumentation systems, etc.) and don't ignore the problems that they are reporting!

Hope that helps,

--Yury.

comment:3 by anonymous, 6 years ago

Two updates:

  • I'm still getting tons of UBSan unaligned access warnings also on the latest version of Lockfree from git with our production code, although the problem with the original reproducer seems to be gone... maybe worth it letting unit tests run through UBSan or so.
  • The segfaults that I've mentioned were unrelated to the Lockfree, just triggered by it, ended up being botched singletons in our code, which weren't single after all. The rest of the analysis is unaffected though.
Note: See TracTickets for help on using tickets.