Opened 8 years ago

Last modified 6 years ago

#10956 new Bugs

null pointer exception using asio on linux

Reported by: xuzhiteng <419855192@…> Owned by: chris_kohlhoff
Milestone: To Be Determined Component: asio
Version: Boost 1.56.0 Severity: Problem
Keywords: Cc:

Description

I have written a Linux server application on CentOS based on Boost.Asio. I created a thread pool to speed up asynchronous asio operations, but the server cores down with the following backtrace:

#0  start_op (this=0x1196e18, impl=..., op_type=1, op=0x7fd2b01a8c30, is_continuation=<value optimized out>, is_non_blocking=true, noop=false)
    at /usr/share/server_depends/cmake/depends/net/../../../depends/net/../boost_1_56_0/boost/asio/detail/impl/epoll_reactor.ipp:219
#1  boost::asio::detail::reactive_socket_service_base::start_op (this=0x1196e18, impl=..., op_type=1, op=0x7fd2b01a8c30, is_continuation=<value optimized out>, is_non_blocking=true, noop=false)
    at /usr/share/server_depends/cmake/depends/net/../../../depends/net/../boost_1_56_0/boost/asio/detail/impl/reactive_socket_service_base.ipp:214
#2  0x00007fd2b94dc667 in async_send<boost::asio::detail::consuming_buffers<boost::asio::const_buffer, asio_net::shared_const_buffer>, boost::asio::detail::write_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >, asio_net::shared_const_buffer, boost::asio::detail::transfer_all_t, boost::_bi::bind_t<void, boost::_mfi::mf2<void, asio_net::AsioSocket, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<asio_net::AsioSocket*>, boost::arg<1>, boost::arg<2> > > > > (this=<value optimized out>, ec=<value optimized out>, bytes_transferred=<value optimized out>, start=<value optimized out>)
    at /usr/share/server_depends/cmake/depends/net/../../../depends/net/../boost_1_56_0/boost/asio/detail/reactive_socket_service_base.hpp:216
#3  async_send<boost::asio::detail::consuming_buffers<boost::asio::const_buffer, asio_net::shared_const_buffer>, boost::asio::detail::write_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >, asio_net::shared_const_buffer, boost::asio::detail::transfer_all_t, boost::_bi::bind_t<void, boost::_mfi::mf2<void, asio_net::AsioSocket, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<asio_net::AsioSocket*>, boost::arg<1>, boost::arg<2> > > > > (this=<value optimized out>, ec=<value optimized out>, bytes_transferred=<value optimized out>, start=<value optimized out>)
    at /usr/share/server_depends/cmake/depends/net/../../../depends/net/../boost_1_56_0/boost/asio/stream_socket_service.hpp:330
#4  async_write_some<boost::asio::detail::consuming_buffers<boost::asio::const_buffer, asio_net::shared_const_buffer>, boost::asio::detail::write_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >, asio_net::shared_const_buffer, boost::asio::detail::transfer_all_t, boost::_bi::bind_t<void, boost::_mfi::mf2<void, asio_net::AsioSocket, boost::system::error_code const&, unsigned long>, boost::_bi::list3<boost::_bi::value<asio_net::AsioSocket*>, boost::arg<1>, boost::arg<2> > > > > (this=<value optimized out>, ec=<value optimized out>, bytes_transferred=<value optimized out>, start=<value optimized out>)
    at /usr/share/server_depends/cmake/depends/net/../../../depends/net/../boost_1_56_0/boost/asio/basic_stream_socket.hpp:732

My guess is that the socket-writing thread starts a reactor operation while another thread closes the same socket concurrently. Is that a bug? Should the socket-closing function take a lock to avoid it?
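A minimal sketch of the suspected pattern (hypothetical code, not the reporter's actual application): one thread starts an asynchronous write on a socket while another thread closes the same socket with no synchronisation. Asio gives no thread-safety guarantee for this combination, so it can crash inside the epoll reactor as in the backtrace above.

    #include <boost/asio.hpp>
    #include <boost/thread.hpp>
    #include <string>

    int main()
    {
        boost::asio::io_service io;
        boost::asio::ip::tcp::socket sock(io);
        // ... assume sock is connected and io.run() is executing on a thread pool ...

        std::string data(1024, 'x');

        // Thread A: starts an asynchronous write on the socket.
        boost::thread writer([&] {
            boost::asio::async_write(sock, boost::asio::buffer(data),
                [](const boost::system::error_code&, std::size_t) {});
        });

        // Thread B: closes the same socket with no synchronisation.
        boost::thread closer([&] {
            boost::system::error_code ec;
            sock.close(ec);   // races with the start_op() triggered by the write
        });

        writer.join();
        closer.join();
    }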

Change History (4)

comment:1 by viboes, 8 years ago

Component: None → asio
Owner: set to chris_kohlhoff

comment:2 by mike.sampson@…, 7 years ago

I just ran into this same issue today. From what I can tell, there is a race condition between epoll_reactor::deregister_descriptor/epoll_reactor::deregister_internal_descriptor and the epoll_reactor::start_op and epoll_reactor::cancel_ops routines.

Both start_op and cancel_ops check descriptor_data for NULL before proceeding; however, this check is not enough. If descriptor_data is non-NULL when the check is made and the code then blocks while trying to acquire descriptor_data->mutex, there is no guarantee the object is still valid once the lock is acquired: both deregister_xxx routines release the lock just before deleting the object and nulling out descriptor_data.

A segmentation fault then occurs when dereferencing descriptor_data, for example when checking descriptor_data->shutdown in start_op.
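The shape of the race, as a simplified check-then-lock sketch (this is not the actual Boost.Asio source; names and structure are only illustrative):

    #include <mutex>

    struct descriptor_state {
        std::mutex mutex_;
        bool shutdown_ = false;
        // ...
    };

    void start_op(descriptor_state*& descriptor_data)
    {
        if (!descriptor_data)                 // (1) check: pointer is non-NULL right now
            return;

        // (2) A concurrent deregister_descriptor() can run here: it releases its
        //     lock, deletes the object, and sets descriptor_data to NULL.

        descriptor_data->mutex_.lock();       // (3) use: may touch freed memory
        if (descriptor_data->shutdown_) {     // the kind of dereference that faults
            descriptor_data->mutex_.unlock();
            return;
        }
        // ... queue the operation ...
        descriptor_data->mutex_.unlock();
    }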

comment:3 by mike.sampson@…, 7 years ago

After investigating further, this issue was already reported and marked "ignore" because there is no thread-safety guarantee when closing a socket from one thread while issuing a read/write operation from another. I am adjusting my code to protect these operations with a mutex myself.
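One way to do that (an illustrative sketch with assumed names, not the commenter's actual code): guard both the call that starts an asynchronous write and the close with the same mutex, so a close can never interleave with starting an operation on the same socket.

    #include <boost/asio.hpp>
    #include <boost/bind.hpp>
    #include <boost/thread/mutex.hpp>
    #include <string>

    class connection {
    public:
        explicit connection(boost::asio::io_service& io) : socket_(io) {}

        // Caller must keep `data` alive until the handler runs (real code would
        // use a shared buffer, as in the backtrace above).
        void async_send(const std::string& data)
        {
            boost::mutex::scoped_lock lock(mutex_);
            if (!socket_.is_open())
                return;                        // already closed by another thread
            boost::asio::async_write(socket_, boost::asio::buffer(data),
                boost::bind(&connection::handle_write, this,
                    boost::asio::placeholders::error,
                    boost::asio::placeholders::bytes_transferred));
        }

        void close()
        {
            boost::mutex::scoped_lock lock(mutex_);
            boost::system::error_code ec;
            socket_.close(ec);                 // cannot race with async_send above
        }

    private:
        void handle_write(const boost::system::error_code&, std::size_t) {}

        boost::mutex mutex_;
        boost::asio::ip::tcp::socket socket_;
    };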

comment:4 by jesse.pepper@…, 6 years ago

Can you point at the original report that was marked ignore?
