Opened 11 years ago
Closed 9 years ago
#5596 closed Bugs (fixed)
MPI: problem creating communicator
Reported by: | Owned by: | Matthias Troyer | |
---|---|---|---|
Milestone: | To Be Determined | Component: | mpi |
Version: | Boost 1.42.0 | Severity: | Problem |
Keywords: | Cc: |
Description
When I create a communicator from a group, the program uses 100% of the CPU and never finishes creating the communicator. I'm attaching a simple example.
Attachments (1)
Change History (13)
by , 11 years ago
comment:2 by , 11 years ago
Hello!
I experience the same issue with OpenMPI 1.4.2 and gcc 4.4.5. I'm still at the very beginning of parallel programming, so I may be talking nonsense from time to time (if so, please correct me). I hope this discussion will attract some more experienced users who can give a better answer to this question.
So far I consider 3 possibilities:
- An incompatibility between the
std::vector<int>::iterator values from v.begin() and v.end()
passed as input parameters to
template<typename InputIterator> group include(InputIterator first, InputIterator last);
This can probably be ruled out, because for a vector of a given length
std::vector<int> v(x); g.size();
returns a number equal to the vector length, x.
- Some copy constructor or pointer issue.
This is pretty much a blind guess after reading an article (I had a problem posting the link).
- Wrong parentheses, together with the choice of the process that performs the group creation.
I have to investigate this further by looking inside the specific implementations. I hope to reply again in the next few days.
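One detail relevant to the first hypothesis: `std::vector<int> v(x)` value-initializes its elements, so it holds x zeros, and passing it to `include` names rank 0 x times. The MPI standard requires the ranks given to `MPI_Group_incl` to be distinct (Boost.MPI's `include` is assumed here to forward to that call), so such a vector is only a valid rank list for x == 1. The following is a minimal sketch of building a list of distinct ranks with the standard library; the helper name `first_ranks` is hypothetical and not part of Boost.MPI.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Boost.MPI): returns the ranks 0, 1, ..., n-1.
std::vector<int> first_ranks(int n) {
    assert(n >= 0);  // a negative rank count makes no sense
    // n elements, all value-initialized to 0 at this point
    std::vector<int> ranks(static_cast<std::size_t>(n));
    // overwrite the zeros with the distinct values 0, 1, ..., n-1
    std::iota(ranks.begin(), ranks.end(), 0);
    return ranks;
}
```

The result could then be passed on as `wg.include(ranks.begin(), ranks.end())`, assuming the parent group has at least n members.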
follow-ups: 5 6 comment:3 by , 11 years ago
It's strange, but a dynamically created communicator works fine.
#include <vector>
#include <boost/mpi.hpp>

namespace mpi = boost::mpi;

int main(int argc, char argv[])
{
    mpi::environment env(argc, argv);
    mpi::communicator world, c;

    std::vector<int> v(1);
    mpi::group wg = world.group();
    mpi::group g = wg.include(v.begin(), v.end());
    c = new mpi::communicator(world, g);

    if (!world.rank()){
        std::cout << "v.size : " << v.size() << std::endl;
        std::cout << "wg.size : " << wg.size() << std::endl;
        std::cout << "g.size : " << g.size() << std::endl;
        std::cout << "c.size : " << c->size() << std::endl;
    }
    return 0;
}
comment:5 by , 11 years ago
Replying to monika.cienkus@…:
It's strange, but a dynamically created communicator works fine.
#include <vector>
#include <boost/mpi.hpp>

namespace mpi = boost::mpi;

int main(int argc, char argv[])
Hello!
I also noticed that you left out the clause:
if (!world.rank())
That could confirm my third suspicion, but more on that below.
You missed the * in:
int main(int argc, char * argv[])
With that fixed, your code compiles fine.
Could you show me an example that uses the c communicator? I can't get it to work so far. Something as simple as:
if (c->rank() == 0)
gives me a clone_impl exception, in addition to the same pointer exceptions I get by simply dropping the clause:
if (!world.rank())
in the example provided by the author of the ticket. It seems that using a pointer to c only delays the exceptions until the moment c is used!
I think what we are doing here is simply redefining c, but from different threads. That's why I believe dropping the clause will not help at all: assigning processes to the communicator should be done by a single process.
Please elaborate.
comment:6 by , 11 years ago
I tested your solution and it doesn't resolve the problem. The program still uses 100% of the CPU and doesn't finish, so creating the communicator dynamically doesn't make a difference.
Replying to monika.cienkus@…:
It's strange, but a dynamically created communicator works fine.
#include <vector>
#include <boost/mpi.hpp>

namespace mpi = boost::mpi;

int main(int argc, char argv[])
{
    mpi::environment env(argc, argv);
    mpi::communicator world, c;

    std::vector<int> v(1);
    mpi::group wg = world.group();
    mpi::group g = wg.include(v.begin(), v.end());
    c = new mpi::communicator(world, g);

    if (!world.rank()){
        std::cout << "v.size : " << v.size() << std::endl;
        std::cout << "wg.size : " << wg.size() << std::endl;
        std::cout << "g.size : " << g.size() << std::endl;
        std::cout << "c.size : " << c->size() << std::endl;
    }
    return 0;
}
follow-up: 8 comment:7 by , 11 years ago
#include <vector>
#include <boost/mpi.hpp>

namespace mpi = boost::mpi;

int main(int argc, char** argv)
{
    mpi::environment env(argc, argv);
    mpi::communicator world;

    std::vector<int> ranks(1);                 // {0}
    mpi::group g = world.group();              // get the group from MPI_COMM_WORLD...
    g = g.include(ranks.begin(), ranks.end()); // ...and select only one (the first) host from it

    /* ---------------------------------------------------------------------------------
    // sample 1: does not work; improper use of MPI library calls
    // MPI_Comm_create (called from an mpi::communicator constructor) is a collective
    // operation and must be called on each host of the parent communicator ("world" in this case)
    if (!world.rank()) {
        mpi::communicator myComm(world, g);
    }
    some_useful_function(); // <- we never reach this place
    */

    /* ---------------------------------------------------------------------------------
    // sample 2: still does not work
    // the condition is removed (but a local variable scope is still used)
    // we call MPI_Comm_create on each host, but only one host creates a communicator;
    // every other host gets MPI_COMM_NULL
    {
        mpi::communicator myComm(world, g);
    } // <- here we run into trouble with MPI_Comm_free(MPI_COMM_NULL)
      //    because boost::mpi::communicator::comm_free doesn't check for this
    */

    /* ---------------------------------------------------------------------------------
    // sample 3: works, but with a restriction
    // manually call MPI_Finalize before the myComm destructor runs
    */
    mpi::communicator myComm(world, g);
    MPI::Finalize();

    return 0;
}
I think the solution to this problem is a small fix in communicator.hpp (I'm using Boost 1.45):
struct comm_free
{
    void operator()(MPI_Comm* comm) const
    {
        int finalized;
        BOOST_MPI_CHECK_RESULT(MPI_Finalized, (&finalized));
        if (!finalized && (MPI_Comm)comm != MPI_COMM_NULL) //fix here
            BOOST_MPI_CHECK_RESULT(MPI_Comm_free, (comm));
        delete comm;
    }
};
P.S. Sorry for my English; it's not my first language :)
follow-up: 10 comment:8 by , 11 years ago
Replying to tapir2@…:
A small mistake, sorry... without the type cast, of course:
if (!finalized && *comm != MPI_COMM_NULL) //fix here
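The null-handle guard from the corrected condition can be illustrated without an MPI installation. The sketch below uses stand-in names (`fake_comm`, `FAKE_COMM_NULL`, `fake_comm_free`, `guarded_comm_free`, `wrap`), all of which are hypothetical and are not the real MPI or Boost.MPI API. It only shows the pattern of a smart-pointer deleter that skips the release call for a null handle and still deletes the heap storage; the additional `MPI_Finalized` check from the proposed fix is left out for brevity.

```cpp
#include <cassert>
#include <memory>

// Stand-ins for illustration only; these are NOT the real MPI types or calls.
using fake_comm = int;
const fake_comm FAKE_COMM_NULL = -1;  // plays the role of MPI_COMM_NULL
int g_free_calls = 0;                 // counts calls to the stand-in free function

// Plays the role of MPI_Comm_free: releasing the null handle is an error.
void fake_comm_free(fake_comm* comm) {
    assert(*comm != FAKE_COMM_NULL);
    ++g_free_calls;
    *comm = FAKE_COMM_NULL;
}

// Deleter with the guard: release the handle only when it is not the null
// handle, then always delete the heap-allocated storage.
struct guarded_comm_free {
    void operator()(fake_comm* comm) const {
        if (*comm != FAKE_COMM_NULL)
            fake_comm_free(comm);
        delete comm;
    }
};

// Owns a handle through a shared_ptr with the guarded deleter.
std::shared_ptr<fake_comm> wrap(fake_comm handle) {
    return std::shared_ptr<fake_comm>(new fake_comm(handle), guarded_comm_free());
}
```

In this model, destroying `wrap(7)` calls the stand-in free function once, while destroying `wrap(FAKE_COMM_NULL)` does not call it at all, which corresponds to the processes that receive a null communicator.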
comment:9 by , 11 years ago
A new communicator must always be created for all processes. The constructor calls MPI_Comm_create() which must be executed by all processes, even if they don't belong to the new group (see http://www.mpi-forum.org/docs/mpi-11-html/node102.html).
However, I can confirm that I also have to call MPI_Finalize() myself (I'm using Boost 1.49.0). This call is only required, though, for those processes that don't belong to the newly created group. Boost.MPI behaves as if it skips calling MPI_Finalize() in the destructor of boost::mpi::environment for those processes.
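The membership rule described here can be written down as a toy model: every process of the parent communicator makes the call, members of the group get a rank in the new communicator, and all others get a null communicator. The function below is a plain C++ sketch of that rule, not MPI code; the name `new_rank_of` and the use of -1 to stand for MPI_COMM_NULL are assumptions made for illustration. It relies on the MPI rule that position i of the rank list passed to the group's include operation becomes rank i of the new group.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model, not MPI code: the rank the calling process would have in the
// new communicator, or -1 where MPI would hand back MPI_COMM_NULL.
// `ranks` is the list passed to include(); position i holds the world rank
// that becomes rank i of the new group.
int new_rank_of(int world_rank, const std::vector<int>& ranks) {
    assert(world_rank >= 0);  // ranks are never negative
    for (std::size_t i = 0; i < ranks.size(); ++i)
        if (ranks[i] == world_rank)
            return static_cast<int>(i);
    return -1;  // not a member, although this process still had to make the call
}
```

With the rank list from the ticket's example (a single element equal to 0), only world rank 0 gets a valid rank; every other process ends up in the null case that the destructor has to tolerate.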
comment:10 by , 10 years ago
comment:11 by , 10 years ago
Owner: | changed from | to
---|
comment:12 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |