Boost C++ Libraries: Ticket #5596: MPI: problem creating communicator https://svn.boost.org/trac10/ticket/5596 <p> When I create a communicator from a group, the program utilizes the CPU fully, and the code doesn't create the communicator. I'm attaching a simple example. </p> irek.szczesniak@… Mon, 06 Jun 2011 19:34:00 GMT attachment set https://svn.boost.org/trac10/ticket/5596 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">test.cpp</span> </li> </ul> <p> sample test </p> Ticket anonymous Mon, 06 Jun 2011 19:36:03 GMT <link>https://svn.boost.org/trac10/ticket/5596#comment:1 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5596#comment:1</guid> <description> <p> Forgot to add that I'm using OpenMPI 1.3 on Debian 6. </p> </description> <category>Ticket</category> </item> <item> <author>dwsel@…</author> <pubDate>Tue, 07 Jun 2011 17:10:04 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5596#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5596#comment:2</guid> <description> <p> Hello! </p> <p> I experience the same issue with OpenMPI 1.4.2 and gcc 4.4.5. I'm still at the very beginning of parallel programming, so I may be talking nonsense from time to time (if so, please correct me). I hope this discussion will attract some more experienced users who can give a better answer to this question. 
</p> <p> So far I consider 3 possibilities: </p> <ol><li>incompatibility between <div class="wikipage" style="font-size: 80%"><div class="wiki-code"><div class="code"><pre><span class="nl">std</span><span class="p">:</span><span class="n">iterator</span> <span class="n">from</span> <span class="n">v</span><span class="p">.</span><span class="n">begin</span><span class="p">()</span> <span class="n">and</span> <span class="n">v</span><span class="p">.</span><span class="n">end</span><span class="p">()</span> </pre></div></div></div></li></ol><p> as input parameter for </p> <div class="wikipage" style="font-size: 80%"><div class="wiki-code"><div class="code"><pre><span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="n">InputIterator</span><span class="o">&gt;</span> <span class="n">group</span> <span class="n">include</span><span class="p">(</span><span class="n">InputIterator</span> <span class="n">first</span><span class="p">,</span> <span class="n">InputIterator</span> <span class="n">last</span><span class="p">);</span> </pre></div></div></div><p> can be false because for vector of given length </p> <div class="wikipage" style="font-size: 80%"><div class="wiki-code"><div class="code"><pre><span class="n">std</span><span class="o">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;</span> <span class="n">v</span><span class="p">(</span><span class="n">x</span><span class="p">);</span> <span class="n">g</span><span class="p">.</span><span class="n">size</span><span class="p">();</span> </pre></div></div></div><p> gives number equal to vector length = x </p> <ol start="2"><li>Some copy constructor/pointers issue </li></ol><p> This is pretty much blind suspect after reading (problem with posting link to article) </p> <ol start="3"><li>Wrong parentheses along with process number that performs the operation of group creation </li></ol><p> I have to investigate 
the thing further by looking inside specific implementations. I hope to answer again in the next few days. </p> </description> <category>Ticket</category> </item> <item> <author>monika.cienkus@…</author> <pubDate>Mon, 13 Jun 2011 12:18:29 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5596#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5596#comment:3</guid> <description> <p> It's strange but dynamically created communicator work fine. </p> <pre class="wiki">#include &lt;vector&gt;
#include &lt;boost/mpi.hpp&gt;

namespace mpi = boost::mpi;

int main(int argc, char argv[])
{
  mpi::environment env(argc, argv);
  mpi::communicator world, c;
  std::vector&lt;int&gt; v(1);
  mpi::group wg = world.group();
  mpi::group g = wg.include(v.begin(), v.end());
  c = new mpi::communicator(world, g);
  if (!world.rank()){
    std::cout &lt;&lt; "v.size : " &lt;&lt; v.size() &lt;&lt; std::endl;
    std::cout &lt;&lt; "wg.size : " &lt;&lt; wg.size() &lt;&lt; std::endl;
    std::cout &lt;&lt; "g.size : " &lt;&lt; g.size() &lt;&lt; std::endl;
    std::cout &lt;&lt; "c.size : " &lt;&lt; c-&gt;size() &lt;&lt; std::endl;
  }
  return 0;
}
</pre> </description> <category>Ticket</category> </item> <item> <author>monika.cienkus@…</author> <pubDate>Mon, 13 Jun 2011 13:32:13 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5596#comment:4 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5596#comment:4</guid> <description> <p> There should be: </p> <pre class="wiki"> mpi::communicator world, *c;
</pre> </description> <category>Ticket</category> </item> <item> <author>dwsel &lt;dwsel@…&gt;</author> <pubDate>Mon, 13 Jun 2011 19:31:56 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5596#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5596#comment:5</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/5596#comment:3" title="Comment 3">monika.cienkus@…</a>: </p> 
<blockquote class="citation"> <p> It's strange but dynamically created communicator work fine. </p> <pre class="wiki">#include &lt;vector&gt; #include &lt;boost/mpi.hpp&gt; namespace mpi = boost::mpi; int main(int argc, char argv[]) </pre></blockquote> <p> Hello! </p> <p> I have noticed as well you left out clause: </p> <div class="wikipage" style="font-size: 80%"><div class="wiki-code"><div class="code"><pre><span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">world</span><span class="p">.</span><span class="n">rank</span><span class="p">())</span> </pre></div></div></div><p> That could confirm my 3rd suspicion, but... more about my opinion below </p> <p> You missed * in: </p> <div class="wikipage" style="font-size: 80%"><div class="wiki-code"><div class="code"><pre><span class="kt">int</span> <span class="n">main</span><span class="p">(</span><span class="kt">int</span> <span class="n">argc</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span> <span class="n">argv</span><span class="p">[])</span> </pre></div></div></div><p> After that your code compiles well. </p> <p> Could you show me example use of the c communicator? I can't seem to get it to work so far. 
Something as simple as: </p> <div class="wikipage" style="font-size: 80%"><div class="wiki-code"><div class="code"><pre><span class="k">if</span> <span class="p">(</span><span class="n">c</span><span class="o">-&gt;</span><span class="n">rank</span><span class="p">()</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> </pre></div></div></div><p> gives me a clone_impl exception in addition to the same pointer exceptions I get by simply dropping the clause: </p> <div class="wikipage" style="font-size: 80%"><div class="wiki-code"><div class="code"><pre><span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">world</span><span class="p">.</span><span class="n">rank</span><span class="p">())</span> </pre></div></div></div><p> in the example provided by the author of the ticket. It seems that using a pointer for c only delays the exceptions until the moment it is used! </p> <p> I think what we are doing here is simply redefining c, but from different threads. That's why I believe dropping the clause will not help at all, because assigning processes to the communicator should be done by a single process. </p> <p> Please elaborate. </p> </description> <category>Ticket</category> </item> <item> <author>irek.szczesniak@…</author> <pubDate>Tue, 14 Jun 2011 08:58:01 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5596#comment:6 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5596#comment:6</guid> <description> <p> I tested your solution and it doesn't resolve the problem. The program still uses 100% of the CPU and doesn't finish. So creating the communicator dynamically doesn't make a difference. </p> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/5596#comment:3" title="Comment 3">monika.cienkus@…</a>: </p> <blockquote class="citation"> <p> It's strange but dynamically created communicator work fine. 
</p> <pre class="wiki">#include &lt;vector&gt;
#include &lt;boost/mpi.hpp&gt;

namespace mpi = boost::mpi;

int main(int argc, char argv[])
{
  mpi::environment env(argc, argv);
  mpi::communicator world, c;
  std::vector&lt;int&gt; v(1);
  mpi::group wg = world.group();
  mpi::group g = wg.include(v.begin(), v.end());
  c = new mpi::communicator(world, g);
  if (!world.rank()){
    std::cout &lt;&lt; "v.size : " &lt;&lt; v.size() &lt;&lt; std::endl;
    std::cout &lt;&lt; "wg.size : " &lt;&lt; wg.size() &lt;&lt; std::endl;
    std::cout &lt;&lt; "g.size : " &lt;&lt; g.size() &lt;&lt; std::endl;
    std::cout &lt;&lt; "c.size : " &lt;&lt; c-&gt;size() &lt;&lt; std::endl;
  }
  return 0;
}
</pre></blockquote> </description> <category>Ticket</category> </item> <item> <author>tapir2@…</author> <pubDate>Mon, 08 Aug 2011 12:26:04 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5596#comment:7 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5596#comment:7</guid> <description> <pre class="wiki"> #include &lt;vector&gt;
 #include &lt;boost/mpi.hpp&gt;

 namespace mpi = boost::mpi;

 int main(int argc, char** argv)
 {
   mpi::environment env(argc, argv);
   mpi::communicator world;
   std::vector&lt;int&gt; ranks(1); // {0}
   mpi::group g = world.group(); // getting group from MPI_COMM_WORLD...
   g = g.include(ranks.begin(), ranks.end()); // ...and selecting only one (first) host from it

   /* ---------------------------------------------------------------------------------
   //sample 1: does not work, inappropriate use of MPI library calls
   //MPI_Comm_create (called from the mpi::communicator constructor) is a collective
   //operation and must be called on each host of the parent communicator ("world" in this case)
   if (!world.rank()) {
     mpi::communicator myComm(world, g);
   }
   some_useful_function(); // &lt;- we don't reach this place
   */

   /* ---------------------------------------------------------------------------------
   //sample 2: still not working
   //remove the condition (but still use a local variable scope)
   //we call MPI_Comm_create on each host, but only one host creates a communicator,
   //every other host gets MPI_COMM_NULL
   {
     mpi::communicator myComm(world, g);
   } // &lt;- at this place we have trouble with MPI_Comm_free(MPI_COMM_NULL)
     //    because boost::mpi::communicator::comm_free doesn't check for this
   */

   /* ---------------------------------------------------------------------------------
   //sample 3: works, but with a restriction
   //manually call MPI_Finalize before the myComm destructor
   //
   */
   mpi::communicator myComm(world, g);
   MPI::Finalize();
   return 0;
 }
</pre><p> I think the solution to this problem is a small fix in communicator.hpp (I'm using Boost 1.45): </p> <pre class="wiki"> struct comm_free
 {
   void operator()(MPI_Comm* comm) const
   {
     int finalized;
     BOOST_MPI_CHECK_RESULT(MPI_Finalized, (&amp;finalized));
     if (!finalized &amp;&amp; (MPI_Comm)comm != MPI_COMM_NULL) //fix here
       BOOST_MPI_CHECK_RESULT(MPI_Comm_free, (comm));
     delete comm;
   }
 };
</pre><p> P.S. 
sorry for my English, it's not my first language :) </p> </description> <category>Ticket</category> </item> <item> <author>tapir2@…</author> <pubDate>Mon, 08 Aug 2011 12:49:09 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5596#comment:8 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5596#comment:8</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/5596#comment:7" title="Comment 7">tapir2@…</a>: </p> <p> a little mistake, sorry... without type casting of course </p> <blockquote class="citation"> <pre class="wiki"> if (!finalized &amp;&amp; *comm != MPI_COMM_NULL) //fix here </pre></blockquote> </description> <category>Ticket</category> </item> <item> <dc:creator>bschaeling</dc:creator> <pubDate>Sun, 15 Apr 2012 16:59:26 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5596#comment:9 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5596#comment:9</guid> <description> <p> A new communicator must always be created on all processes. The constructor calls MPI_Comm_create(), which must be executed by all processes, even if they don't belong to the new group (see <a class="ext-link" href="http://www.mpi-forum.org/docs/mpi-11-html/node102.html"><span class="icon"></span>http://www.mpi-forum.org/docs/mpi-11-html/node102.html</a>). </p> <p> However, I can confirm that I also have to call MPI_Finalize() myself (I'm using Boost 1.49.0). That call is only required, though, for those processes which don't belong to the newly created group. Boost.MPI behaves as if it skips calling MPI_Finalize() in the destructor of boost::mpi::environment for those processes. 
</p> </description> <category>Ticket</category> </item> <item> <author>irek.szczesniak@…</author> <pubDate>Tue, 22 May 2012 14:24:40 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5596#comment:10 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5596#comment:10</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/5596#comment:8" title="Comment 8">tapir2@…</a>: </p> <blockquote class="citation"> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/5596#comment:7" title="Comment 7">tapir2@…</a>: </p> <p> a little mistake, sorry... without type casting of course </p> <blockquote class="citation"> <pre class="wiki"> if (!finalized &amp;&amp; *comm != MPI_COMM_NULL) //fix here </pre></blockquote> </blockquote> <p> I checked whether this code fixes the problem, and it doesn't. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Matthias Troyer</dc:creator> <pubDate>Tue, 01 Jan 2013 11:41:14 GMT</pubDate> <title>owner changed https://svn.boost.org/trac10/ticket/5596#comment:11 https://svn.boost.org/trac10/ticket/5596#comment:11 <ul> <li><strong>owner</strong> changed from <span class="trac-author">Douglas Gregor</span> to <span class="trac-author">Matthias Troyer</span> </li> </ul> Ticket Matthias Troyer Tue, 11 Jun 2013 08:30:41 GMT status changed; resolution set https://svn.boost.org/trac10/ticket/5596#comment:12 https://svn.boost.org/trac10/ticket/5596#comment:12 <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">fixed</span> </li> </ul> <p> (In <a class="changeset" href="https://svn.boost.org/trac10/changeset/84739" title="Fixed #6436 #5596 and added threaded initialization">[84739]</a>) Fixed <a class="closed ticket" href="https://svn.boost.org/trac10/ticket/6436" title="#6436: Patches: Exception on 
destruction of communicator mpi_comm_null (closed: fixed)">#6436</a> <a class="closed ticket" href="https://svn.boost.org/trac10/ticket/5596" title="#5596: Bugs: MPI: problem creating communicator (closed: fixed)">#5596</a> and added threaded initialization </p> Ticket
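<p> The guard discussed in comments 7 and 8 (skip the free call when the handle is the null communicator) can be sketched without MPI at all. The block below is only a minimal model under stated assumptions: <code>fake_comm</code>, <code>FAKE_COMM_NULL</code>, <code>fake_comm_free</code>, <code>g_finalized</code>, <code>guarded_comm_free</code> and <code>adopt</code> are hypothetical stand-ins invented for illustration, not MPI or Boost.MPI names, and the sketch makes no claim about what changeset [84739] actually changed. Note also that comment 10 reports this guard alone did not resolve the original hang. </p>

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for MPI_Comm and MPI_COMM_NULL (assumptions,
// not real MPI names), so the sketch builds without an MPI installation.
using fake_comm = int;
const fake_comm FAKE_COMM_NULL = -1;

static int g_free_calls = 0;      // counts successful "free" calls
static bool g_finalized = false;  // stand-in for the result of MPI_Finalized()

// Stand-in for MPI_Comm_free: freeing the null handle is treated as an error.
inline bool fake_comm_free(fake_comm* comm) {
  if (*comm == FAKE_COMM_NULL) return false;  // a real MPI would report an error here
  ++g_free_calls;
  *comm = FAKE_COMM_NULL;
  return true;
}

// Deleter with the guard from comment 8: only free a handle that is
// not null, and only while the (fake) library is not yet finalized.
struct guarded_comm_free {
  void operator()(fake_comm* comm) const {
    if (!g_finalized && *comm != FAKE_COMM_NULL)
      fake_comm_free(comm);
    delete comm;
  }
};

// Wrap a handle in a shared_ptr that uses the guarded deleter.
inline std::shared_ptr<fake_comm> adopt(fake_comm handle) {
  return std::shared_ptr<fake_comm>(new fake_comm(handle), guarded_comm_free());
}
```

<p> In this model, a process that belongs to the new group would hold something like <code>adopt(7)</code> and trigger exactly one free when the pointer goes out of scope, while a process outside the group would hold <code>adopt(FAKE_COMM_NULL)</code> and trigger none. </p>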