Opened 5 years ago

#13319 new Bugs

Boost.Asio.SSL write_some causes 'decryption failed or bad record mac' on large (~1MB) transmissions

Reported by: Andrey Borisov <andrey.borisov@…> Owned by: chris_kohlhoff
Milestone: To Be Determined Component: asio
Version: Boost 1.65.0 Severity: Problem
Keywords: asio ssl write_some Cc:

Description

When we send a large amount of data (about 1MB) through boost::asio::ssl::stream, Boost closes the connection partway through the transmission and reports an encryption error: 'decryption failed or bad record mac'.

The cause is the write_some function.

In short, it uses boost::asio::write to send the encrypted data to the TCP socket (boost_1_57_0\boost\asio\ssl\detail\io.hpp:60, see details below), but the number of bytes actually sent is ignored. Instead, the number of bytes of the application's buffer that OpenSSL handled is returned to the application. When the socket operation would block, the number of bytes actually sent is smaller than the number reported to the application.

Details: If the socket is non-blocking, write_some has to try to send something over the socket and return the number of bytes it was able to send. The application then has to retry with the rest of the output data, taking into account the number of bytes just sent.
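A minimal sketch of that retry contract, using a hypothetical in-memory stand-in instead of a real socket. `FakeNonBlockingSink` and `write_all` are illustrative names, not Boost.Asio APIs; a real non-blocking caller would also wait for the socket to become writable between attempts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a non-blocking socket: each call accepts at most
// capacity_per_call bytes and reports how many it actually took.
struct FakeNonBlockingSink {
    std::size_t capacity_per_call;
    std::string received;

    std::size_t write_some(const char* data, std::size_t size) {
        const std::size_t n = std::min(size, capacity_per_call);
        received.append(data, n);
        return n;  // may be smaller than size
    }
};

// Keeps calling write_some, advancing by the count it returns, until the
// whole payload has been accepted. Returns the number of calls it took.
std::size_t write_all(FakeNonBlockingSink& sink, const std::string& data) {
    assert(sink.capacity_per_call > 0);  // otherwise the loop would never finish
    std::size_t offset = 0;
    std::size_t calls = 0;
    while (offset < data.size()) {
        offset += sink.write_some(data.data() + offset, data.size() - offset);
        ++calls;
    }
    return calls;
}
```

The loop is only correct if the count returned by write_some matches what really left the sender, which is exactly the property this ticket says is violated.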

The SSL implementation uses an OpenSSL BIO pair so that encrypted data is written to memory buffers rather than directly to a TCP socket. Boost uses these buffers to integrate SSL into its engine, which implements the asynchronous socket operations.

In other words, Boost drives the TCP socket itself, using its own direct or asynchronous I/O mechanisms, and takes the data to send from the OpenSSL buffer.

For the write operation the sequence is the following:

  1. The user application calls boost::asio::ssl::stream::write_some and passes it a buffer with the plaintext data.
  2. This operation calls the engine's detail::io function, which has to encrypt the data using OpenSSL, read the encrypted data from the paired OpenSSL BIO buffer, and then send it out over the TCP socket.
  3. First, the io function invokes detail::write_op, which (through subroutines) calls ::SSL_write. SSL_write stores the encrypted data in a BIO buffer and returns the number of plaintext bytes it handled. The current buffer size is 17KB, so for a large outgoing message the handled size is a little over 16KB.
  4. The number of plaintext bytes handled is stored in bytes_transferred and is later returned to the user application as the number of bytes sent.
  5. After this operation, SSL_get_error reports what OpenSSL wants next: to send encrypted data through the real socket, or to read raw data from the socket.
  6. In our case OpenSSL normally just wants to send the chunk of encrypted data.
  7. detail::io executes the case engine::want_output: branch. It calls boost::asio::write(next_layer, core.engine_.get_output(core.output_buffer_), ec);
  8. This call reads the encrypted data from the OpenSSL buffer and transmits it using the usual Boost write operation.
  9. The problem is that the return value of this call is not checked. The number of bytes it sends can be less than requested (16KB), yet the user application is told that write_some has sent 16+KB.
  10. This leaves the SSL state on the two sides out of sync. The sender believes it has sent 16+KB, but the receiver has received, say, only 4KB. The rest of the data is lost.
  11. The user application issues the next send, skipping those 16KB. After a few more send operations OpenSSL detects the error state and reports the error, and Boost closes the socket.
  12. The reason the write operation may send less than the entire encrypted chunk (16KB) is the following:
  13. It tries to send the full buffer, but with a large data stream the socket can get stuck and its output buffer can fill up.
  14. In this case the socket operation has the following options:
  15. Block in the send operation, in the case of a blocking socket.
  16. Return a “would block” error, in the case of a non-blocking socket.
  17. In our case the socket is non-blocking, so the write operation stops at this point and returns the number of bytes actually sent. But this value is simply ignored.
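
The mismatch described in steps 9 and 10 can be illustrated with a toy model. This is not Boost.Asio code: every name below is hypothetical, the encryption overhead is ignored (encrypted size is taken to equal plaintext size), and the transport is reduced to "accepts at most socket_room bytes before it would block".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Toy model only; all names are hypothetical, not Boost.Asio identifiers.
struct WriteResult {
    std::size_t reported_to_app;  // what write_some tells the caller
    std::size_t sent_on_wire;     // what the transport really accepted
};

// Models a transport that takes at most socket_room bytes before it would block.
inline std::size_t transport_write(std::size_t size, std::size_t socket_room) {
    return std::min(size, socket_room);
}

// Behaviour as described in this report: the count consumed by the encryption
// step is reported, and the transport's return value plays no part in it.
inline WriteResult write_some_ignoring_transport(std::size_t app_bytes,
                                                 std::size_t record_limit,
                                                 std::size_t socket_room) {
    const std::size_t consumed = std::min(app_bytes, record_limit);
    const std::size_t sent = transport_write(consumed, socket_room);
    return {consumed, sent};
}

// Bytes the peer never receives although the application believes they were sent.
inline std::size_t lost_bytes(const WriteResult& r) {
    assert(r.sent_on_wire <= r.reported_to_app);
    return r.reported_to_app - r.sent_on_wire;
}
```

With a 1MB payload, a 16KB record limit and room for only 4KB in the socket, the model reports 16KB to the application while 4KB reach the wire, which matches the 16KB versus 4KB example in step 10.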

Workaround: don't use write_some with ssl.

Change History (0)
