Opened 11 years ago
Last modified 10 years ago
#5849 new Bugs
all_to_all() passes invalid args to allocate when sending lots of data
Reported by: | Owned by: | Matthias Troyer | |
---|---|---|---|
Milestone: | To Be Determined | Component: | mpi |
Version: | Boost 1.47.0 | Severity: | Problem |
Keywords: | Cc: | kdavis@… |
Description
For boost 1.47.0 code of the following form fails...
boost::mpi::communicator communicator; std::vector< std::list<Thing*> > in_values(communicator.size()); std::vector< std::list<Thing*> > out_values(communicator.size()); // Point the lists in in_values to more than numeric_limits<int>::max() bytes of data boost::mpi::all_to_all(communicator,in_values,out_values);
Such code ends up passing the allocator, using openmpi-1.4.3, an invalid argument.
Looking at all_to_all.hpp and the method...
template<typename T> void all_to_all_impl(const communicator& comm, const T* in_values, int n, T* out_values, mpl::false_) { ... }
I think I see why this happens. The method begins as follows...
std::vector<int> send_sizes(size); // The displacements for each outgoing value. std::vector<int> send_disps(size); // The buffer that will store all of the outgoing values std::vector<char, allocator<char> > outgoing; // Pack the buffer with all of the outgoing values. for (int dest = 0; dest < size; ++dest) { // Keep track of the displacements send_disps[dest] = outgoing.size(); // Our own value will never be transmitted, so don't pack it. if (dest != rank) { packed_oarchive oa(comm, outgoing); for (int i = 0; i < n; ++i) oa << in_values[dest * n + i]; } // Keep track of the sizes send_sizes[dest] = outgoing.size() - send_disps[dest]; }
If outgoing.size() or (outgoing.size() - send_disps[dest]) is greater than numeric_limits<int>::max(), then send_disps[dest] or send_sizes[dest] flips over to negative values or is simply junk. After that happens the rest of the code in all_to_all_impl() doesn't really make any sense.
My guess is that instead of using int here...
std::vector<int> send_sizes(size); std::vector<int> send_disps(size); std::vector<int> recv_sizes(size); ...
you should use allocator::size_type.