Opened 14 years ago
Last modified 10 years ago
#2400 reopened Feature Requests
Messages corrupted if isend requests are not retained
Reported by: | Owned by: | Matthias Troyer | |
---|---|---|---|
Milestone: | Boost 1.37.0 | Component: | mpi |
Version: | Boost 1.36.0 | Severity: | Problem |
Keywords: | Cc: |
Description
If I do a series of isends and discard the request objects (because I don't need to know when they complete), the messages can get corrupted. I realize this is the way the MPI library is designed, but I'm wondering if it is possible to use some C++ goodness to make the request objects persist behind the scenes until the request is completed? The behavior is particularly unexpected in the Python layer. Thanks very much!
Change History (8)
comment:1 by , 14 years ago
Component: | None → mpi |
---|---|
Owner: | set to |
comment:2 by , 13 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:3 by , 13 years ago
comment:4 by , 13 years ago
Resolution: | → invalid |
---|---|
Status: | assigned → closed |
comment:5 by , 13 years ago
This is not a bug but intended behavior. A note has been added to the documentation.
comment:6 by , 13 years ago
OK, I understand, but in Python one normally does not have to worry about when it is safe to deallocate an object, as the GC is supposed to take care of this for you. Is there no way to increment the Python reference count of the request object or its buffer until it is completed?
comment:7 by , 13 years ago
No. The basic issue is that you *have to* call wait on the request to finish the irecv or isend operation; you cannot just discard the object. If you do not call wait, the code never checks for completion. There are a few possible solutions, all of them sub-optimal:
1.) keep the buffer alive as you propose, but that will lead to a memory leak, since the request object is the only thing that knows about the buffer. If you discard the request, nobody ever checks for completion and the buffer is never freed.
2.) have the destructor call wait - but then the code might just hang or deadlock in the destructor, which would be very hard to debug.
3.) have the destructor cancel the request, but that also leads to unexpected behavior: discarding the request would automatically cancel the send!
The best option, as far as I can see, is to:
4.) assert on the precondition of the destructor, namely that the request has to be finished. Note that we cannot throw an exception in a destructor, so we would have to abort.
I don't like option 4 either but it seems the safest and will at least tell you that you forgot to wait for completion.
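For the Python layer, a rough analogue of option 4 might look like the sketch below. It is only an illustration, not part of Boost.MPI: `CheckedRequest` and `FakeRequest` are hypothetical names, the wrapped object is assumed to expose `test()` (truthy once complete) and `wait()`, and a `ResourceWarning` stands in for the abort because Python ignores exceptions raised in `__del__`. It also relies on CPython running `__del__` as soon as the last reference goes away.

```python
import warnings


class CheckedRequest:
    """Hypothetical wrapper: complains if a request is dropped unfinished."""

    def __init__(self, request):
        self._request = request
        self._done = False

    def test(self):
        # Poll the underlying request; remember completion once observed.
        if not self._done and self._request.test():
            self._done = True
        return self._done

    def wait(self):
        # Block until the underlying request completes.
        result = self._request.wait()
        self._done = True
        return result

    def __del__(self):
        # Exceptions raised here would be ignored, so warn instead of raising.
        if not self._done:
            warnings.warn("request discarded before completion",
                          ResourceWarning)


class FakeRequest:
    """Stand-in for a real non-blocking request (assumption, for demo only)."""

    def __init__(self):
        self.finished = False

    def test(self):
        return self.finished

    def wait(self):
        self.finished = True
```

Note that `ResourceWarning` is ignored by default; running with `python -W always` (or installing a warnings filter) makes the diagnostic visible.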
comment:8 by , 10 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
We should revisit this ticket, since MPI 3 now explicitly allows requests to be discarded and we should try to mimic this. It will require us to keep buffers alive and perform garbage collection from time to time, once requests are done.
The reason buffers must be kept alive is that, when sending anything other than MPI datatypes, the object is serialized by Boost.Serialization into a temporary buffer, which is then associated with the request object. As a result, when using isend with Boost.MPI one currently always has to wait for completion of the request and cannot discard it.
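The "keep buffers alive and garbage-collect from time to time" idea could be sketched roughly as follows. This is not how Boost.MPI implements anything; `PendingSends` and `FakeRequest` are hypothetical names, and the sketch assumes each request exposes a `test()` method that returns something truthy only once the operation has completed (libraries whose `test()` returns a tuple would need an adapter).

```python
class PendingSends:
    """Hypothetical pool: holds (request, buffer) pairs until completion."""

    def __init__(self):
        self._pending = []

    def add(self, request, buffer):
        # Hold references so neither object is freed while the send is in flight.
        self._pending.append((request, buffer))

    def collect(self):
        # Drop every pair whose request has completed; return how many were freed.
        before = len(self._pending)
        self._pending = [(r, b) for r, b in self._pending if not r.test()]
        return before - len(self._pending)

    def __len__(self):
        return len(self._pending)


class FakeRequest:
    """Stand-in for a real non-blocking request (assumption, for demo only)."""

    def __init__(self, finished=False):
        self.finished = finished

    def test(self):
        return self.finished


pool = PendingSends()
pool.add(FakeRequest(finished=True), bytearray(b"done"))
pool.add(FakeRequest(finished=False), bytearray(b"in flight"))
freed = pool.collect()  # releases only the completed pair
```

The caller would have to invoke `collect()` periodically (for example before each new send); otherwise this degenerates into the leak described in option 1 of comment:7.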