Opened 9 years ago

Closed 9 years ago

#8613 closed Bugs (fixed)

[Windows] boost::asio::ip::tcp::socket::async_write(boost::asio::null_buffers(), ...) WriteHandler gets a boost::system::error_code with a NULL category pointer on success

Reported by: Segev Finer <segev208@…> Owned by: chris_kohlhoff
Milestone: To Be Determined Component: asio
Version: Boost 1.53.0 Severity: Problem
Keywords: Cc:

Description

The boost::asio::ip::tcp::socket::async_write() WriteHandler gets a boost::system::error_code with a NULL category pointer on success.

This is essentially an invalid boost::system::error_code that will cause an access violation when you try to print it has the code doesn't expect the category pointer to be NULL.

boost::system::error_code is printed like so:

os << ec.category().name() << ':' << ec.value();

Which throws an access violation when ec.category() tries to dereference the unexpected NULL category pointer which shouldn't happen.

Code that reproduces this problem:

#include <iostream>
#include <boost/asio.hpp>
#include <boost/bind.hpp>

using boost::asio::ip::tcp;

void write_handler(const boost::system::error_code & ec)
{
        std::cout << ec << std::endl;
}

int main(void)
{
        boost::asio::io_service io_service;

        tcp::resolver resolver(io_service);
        tcp::resolver::query query("www.google.com", "80");
        tcp::resolver::iterator endpoint_iterator = resolver.resolve(query);

        boost::asio::ip::tcp::socket socket(io_service);
        boost::asio::connect(socket, endpoint_iterator);

        socket.async_write_some(boost::asio::null_buffers(), boost::bind(write_handler, boost::asio::placeholders::error));

        io_service.run();
}

Change History (7)

comment:1 by chris_kohlhoff, 9 years ago

Resolution: worksforme
Status: newclosed

Works for me. Most likely you have a mismatch in your build options somewhere, e.g. between the boost.system library and your program.

comment:2 by Segev Finer <segev208@…>, 9 years ago

Resolution: worksforme
Status: closedreopened

Well it definitly doesn't for me... quite consistently... and it's definitly Windows only (as the tag in the ticket name says)

Also I'm compiling boost under it's default settings and didn't even change any project settings when I tested this in isolation.

heck boost::system works perfectly fine in any other case, it's only when I do a:

socket.async_write_some(boost::asio::null_buffer(), ...)

That this issue manifests, with a boost::asio::buffer() it works.

I tried debugging this and I think I can see the problem too.

async_write_some()/async_send with null_buffer() uses the select_reactor but from what I know and from looking at the code the select() function doesn't use the OVERLAPPED and as such it also doesn't use the embedded OVERLAPPED struct in the win_iocp_operation to store it's result, But when it does post_deffered_completions() on the win_iocp_io_service the service does PostQueuedCompletionStatus() with overlapped_contains_result.

This in turn leads to win_iocp_io_service::do_one() using the embedded OVERLAPPED struct in the operation and since it isn't used by the select reactor is filled with zeroes and as such a NULL error_category pointer.

There is even more: win_iocp_null_buffers_op checks this in it's on_completion():

// The reactor may have stored a result in the operation object.
    if (o->ec_)
      ec = o->ec_;

This means if the reactor has saved a failure error_code in the operation it will use it instead of the one passed as an argument, BUT the operation does succeed in our case so it uses the error_code passed as an argument which has a NULL error_category pointer.

And here is some additional info about the system I'm experiencing this on, maybe it's also dependent on this: Windows 7 SP1 x64 Built boost and the test code under MSVC 2010 at Debug|Win32 (32-bit build)

Can you please look farther into this?

comment:3 by Segev Finer <segev208@…>, 9 years ago

Summary: [Windows] boost::asio::ip::tcp::socket::async_write() WriteHandler gets a boost::system::error_code with a NULL category pointer on success[Windows] boost::asio::ip::tcp::socket::async_write(boost::asio::null_buffers(), ...) WriteHandler gets a boost::system::error_code with a NULL category pointer on success

comment:4 by chris_kohlhoff, 9 years ago

Thank you for taking the time to do a proper analysis. Can you please confirm whether the following diff corrects the issue for you:

@ -235,8 +235,7 @@ void win_iocp_io_service::post_deferred_completion(win_iocp_operation* op)
   op->ready_ = 1;
 
   // Enqueue the operation on the I/O completion port.
-  if (!::PostQueuedCompletionStatus(iocp_.handle,
-        0, overlapped_contains_result, op))
+  if (!::PostQueuedCompletionStatus(iocp_.handle, 0, 0, op))
   {
     // Out of resources. Put on completed queue instead.
     mutex::scoped_lock lock(dispatch_mutex_);
@@ -256,8 +255,7 @@ void win_iocp_io_service::post_deferred_completions(
     op->ready_ = 1;
 
     // Enqueue the operation on the I/O completion port.
-    if (!::PostQueuedCompletionStatus(iocp_.handle,
-          0, overlapped_contains_result, op))
+    if (!::PostQueuedCompletionStatus(iocp_.handle, 0, 0, op))
     {
       // Out of resources. Put on completed queue instead.
       mutex::scoped_lock lock(dispatch_mutex_);
@@ -284,8 +282,7 @@ void win_iocp_io_service::on_pending(win_iocp_operation* op)
   if (::InterlockedCompareExchange(&op->ready_, 1, 0) == 1)
   {
     // Enqueue the operation on the I/O completion port.
-    if (!::PostQueuedCompletionStatus(iocp_.handle,
-          0, overlapped_contains_result, op))
+    if (!::PostQueuedCompletionStatus(iocp_.handle, 0, 0, op))
     {
       // Out of resources. Put on completed queue instead.
       mutex::scoped_lock lock(dispatch_mutex_);

comment:5 by Segev Finer <segev208@…>, 9 years ago

The fix you posted does seem to work.

Altough I think it's possible that this issue exists in the first place, since some other part of the code depended on overlapped_contains_result being passed and as such any code that uses this functions should be reviewed if it requires it or not. a quick search revelead that there aren't that many (Well at least ones that are actually relevant to win_iocp)

For example: win_iocp_overlapped_ptr::release() calls on_pending(), and maybe it needs overlapped_contains_result and maybe it doesn't, I can't really tell from just reading the code. But maybe you already did this?...

Setting the correct argument there is the best fix for this, anyhow IMHO.

Thanks.

comment:6 by chris_kohlhoff, 9 years ago

(In [84487]) Fix bug on Windows where certain operations might generate an error_code with an invalid (i.e. NULL) error_category. Refs #8613

comment:7 by chris_kohlhoff, 9 years ago

Resolution: fixed
Status: reopenedclosed

(In [84530]) Merge from trunk. Fixes #8421, #8602, #7739, #8613, #7939.


r84482 | chris_kohlhoff | 2013-05-25 21:35:10 +1000 (Sat, 25 May 2013) | 1 line

Fix potential deadlock in signal_set implementation.


r84483 | chris_kohlhoff | 2013-05-25 21:38:01 +1000 (Sat, 25 May 2013) | 1 line

Fix error in acceptor example. Refs #8421


r84484 | chris_kohlhoff | 2013-05-25 21:41:19 +1000 (Sat, 25 May 2013) | 1 line

Fix waitable timer documentation. Refs #8602


r84485 | chris_kohlhoff | 2013-05-25 21:46:20 +1000 (Sat, 25 May 2013) | 1 line

Add assertions that num_buckets_ is non-zero. Refs #7739


r84486 | chris_kohlhoff | 2013-05-25 21:50:52 +1000 (Sat, 25 May 2013) | 8 lines

Automatically disable SSL compression.

To mitigate the risk of certain attacks, SSL compression is now disabled by default. To enable, you can use the new ssl::context::clear_options() function like so:

my_context.clear_options(asio::ssl::context::no_compression);


r84487 | chris_kohlhoff | 2013-05-25 21:52:54 +1000 (Sat, 25 May 2013) | 3 lines

Fix bug on Windows where certain operations might generate an error_code with an invalid (i.e. NULL) error_category. Refs #8613


r84488 | chris_kohlhoff | 2013-05-25 21:57:36 +1000 (Sat, 25 May 2013) | 1 line

Fix problem in #warning directive. Refs #7939


r84492 | chris_kohlhoff | 2013-05-25 22:35:43 +1000 (Sat, 25 May 2013) | 2 lines

Fix potential data race due to reading the reactor pointer outside the lock.


r84494 | chris_kohlhoff | 2013-05-25 23:03:48 +1000 (Sat, 25 May 2013) | 1 line

Regenerate documentation.


r84529 | chris_kohlhoff | 2013-05-27 22:17:19 +1000 (Mon, 27 May 2013) | 1 line

Add documentation for new features.


Note: See TracTickets for help on using tickets.