Opened 16 years ago

Closed 13 years ago

#754 closed Bugs (fixed)

boost::any - typeid comparison across shared boundaries

Reported by: kulik
Owned by: nasonov
Milestone:
Component: any
Version: None
Severity: Problem
Keywords:
Cc:

Description (last modified by Marshall Clow)

typeid comparison (using the == operator) fails on
certain platforms under certain conditions. For me it
fails at least for template classes (std::vector, for
example) when they are used across shared library
boundaries. As a result boost::any doesn't work in these
cases and throws a bad_any_cast exception even though it
reports that it holds the same type I am about to cast
it to (I am using gcc-4.1).

boost::python has already solved this by comparing
string representations of types on problematic
platforms. This works well in all cases but can be a
bit slower.

In boost/python/type_id.hpp:
// for this compiler at least, cross-shared-library type_info
// comparisons don't work, so use typeid(x).name() instead. It's not
// yet clear what the best default strategy is.
# if (defined(__GNUC__) && __GNUC__ >= 3) \
 || defined(_AIX) \
 || (   defined(__sgi) && defined(__host_mips)) \
 || (defined(linux) && defined(__INTEL_COMPILER) && defined(__ICC))
#  define BOOST_PYTHON_TYPE_ID_NAME
# endif

I would say the same thing should be applied to boost::any

In boost/any.hpp:
template<typename ValueType>
ValueType * any_cast(any * operand)
{
    return operand && operand->type() == typeid(ValueType)
                ? &static_cast<any::holder<ValueType> *>(operand->content)->held
                : 0;
}

should be replaced with:

template<typename ValueType>
ValueType * any_cast(any * operand)
{
# if (defined(__GNUC__) && __GNUC__ >= 3) \
 || defined(_AIX) \
 || (   defined(__sgi) && defined(__host_mips)) \
 || (defined(linux) && defined(__INTEL_COMPILER) && defined(__ICC))
    // compare the type names instead of the type_info objects
    // (strcmp requires <cstring>)
    return operand && !strcmp(operand->type().name(), typeid(ValueType).name())
                ? &static_cast<any::holder<ValueType> *>(operand->content)->held
                : 0;
# else
    return operand && operand->type() == typeid(ValueType)
                ? &static_cast<any::holder<ValueType> *>(operand->content)->held
                : 0;
# endif
}

btw: I am aware that this may cause performance drops
and it would be great if there was a switch for that or
something for people that aren't using boost::any
across shared boundaries.
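
A switch like the one asked for above could be sketched roughly as follows. This is only an illustration, not what Boost does: the macro EXAMPLE_COMPARE_TYPE_NAMES and the helper same_type are hypothetical names made up for this example.

```cpp
#include <cassert>   // only needed for the assertions in the usage note
#include <cstring>
#include <typeinfo>
#include <vector>

// Hypothetical opt-in switch (not a real Boost macro). Set it to 0 when
// boost::any-style objects never cross a shared library boundary.
#ifndef EXAMPLE_COMPARE_TYPE_NAMES
#  define EXAMPLE_COMPARE_TYPE_NAMES 1
#endif

// Hypothetical helper: true when both type_info objects describe the same type.
inline bool same_type(const std::type_info& lhs, const std::type_info& rhs)
{
#if EXAMPLE_COMPARE_TYPE_NAMES
    // Fast path: same object. Otherwise compare the type names, which still
    // match when each module carries its own copy of the type_info object.
    return &lhs == &rhs || std::strcmp(lhs.name(), rhs.name()) == 0;
#else
    // Plain comparison as performed by the implementation.
    return lhs == rhs;
#endif
}
```

With this in place, a check such as same_type(typeid(std::vector<int>), typeid(std::vector<int>)) is true, while same_type(typeid(int), typeid(double)) is false. The pointer test keeps the common case cheap, so the string comparison only runs when the two objects differ.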

Change History (10)

comment:1 by kulik, 16 years ago

This is IMO a really critical bug and can be solved very easily. Many applications won't work if they use some kind of module mechanism with shared objects. I spent many hours banging my head against this until I realised what was causing it.

comment:2 by alnsn, 16 years ago

Since it's a bug in the C++ implementation, I don't see why I should replace ASAP a portion of good C++ with non-portable code. I would rather suggest that you first check whether this bug is already in your C++ vendor's queue and post a link here.
In the meantime, I will study the boost.python workaround and other libraries. There ought to be a portable type_info somewhere. Boost should not have multiple clones of this functionality. This is especially important for non-portable code.

comment:3 by kulik, 16 years ago

I know it's more or less a gcc-specific problem, but if you look at Visual C++ output, it technically does the same thing: a string comparison of the type representation. Gcc tries to be faster by comparing the memory addresses of the structures that describe the types. Unfortunately, these structures are linked separately into each shared object, which causes this problem when boost::any is used across the boundary.

official gcc bug entry:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23628

There are various references of the problem on gcc.gnu.org mailing list:
http://gcc.gnu.org/ml/gcc/2002-05/msg01970.html
http://gcc.gnu.org/ml/gcc/2003-10/msg01360.html (unanswered)

However, this bug has been present from around gcc-3 onwards. Before gcc-3, types were compared textually (this is how many compilers do it nowadays).
PS: The same problem also exists for dynamic_cast: type comparison fails there too, so casts across shared boundaries fail.

(shared boundary - for example .dll or .so, the problem surfaces when I boost::any_cast something from application in code from the shared object or vice versa)

I apologise for priority bumping this bug, it has really been irritating me for a long time.

PS: dynamically loading shared objects as RTLD_GLOBAL seems to solve this bug, but I consider this a workaround.
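
For reference, the RTLD_GLOBAL workaround mentioned above might look like the sketch below. The helper name load_plugin_global is hypothetical; dlopen, dlerror, RTLD_NOW and RTLD_GLOBAL are the POSIX <dlfcn.h> interface. On older glibc versions the program may need to be linked with -ldl.

```cpp
#include <cassert>   // only needed for the assertion in the usage note
#include <cstdio>
#include <dlfcn.h>   // POSIX: dlopen, dlerror

// Hypothetical helper: load a shared object so that its symbols are made
// available for symbol resolution of subsequently loaded objects.
// Returns a null pointer when loading fails.
inline void* load_plugin_global(const char* path)
{
    void* handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (!handle)
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
    return handle;
}
```

A call such as load_plugin_global("/nonexistent/libexample_plugin.so") returns a null pointer and prints the loader's error message. Whether loading this way actually makes the type_info comparison succeed depends on the toolchain and on how the modules are linked, so treat it as the workaround it is described as here.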

comment:4 by nasonov, 15 years ago

Owner: changed from alnsn to nasonov
Severity: Problem
Status: changed from assigned to new

comment:5 by Marshall Clow, 15 years ago

Component: changed from None to any
Description: modified (diff)

comment:6 by staffan.spam at gimaker se, 15 years ago

I was bitten by this bug as well, and it really wasn't fun to track down.

Using RTLD_GLOBAL isn't possible when using libltdl to load the DSO, since lt_dlopenext() doesn't expose a way to control the flags used with dlopen.

The only fix for this is for us to maintain a patched version of any.hpp, that has the above workaround implemented - which seems like a hopelessly fragile solution.

It doesn't seem like the GCC crowd will *ever* fix this, so can you please consider implementing a workaround for it in Boost?

comment:7 by Mike Dickey, 15 years ago

<banging head against the wall>Just for the record: I lost a day and a half to this bug too... until I was finally able to figure out what seemed to be happening and prove to myself that it wasn't my code that was at fault. Tough one too, because it works fine on some platforms (Windows, OSX, etc.) but not others (i.e. Linux). I agree that this is a really nasty bug b/c it makes Boost's behavior non-deterministic across platforms, which I thought was part of the whole point... </banging head against the wall>

comment:8 by ulit at eikon dot com, 14 years ago

Same here as Mike and staffan found: a long and painful search for why boost::any didn't work across library boundaries.

If the patch above is not acceptable, please consider an additional cast function (e.g. "any_cast_by_name") which does the type comparison based on strings.
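
To make the suggestion concrete, here is a rough sketch of what such a function could look like. It does not use boost::any itself, because the holder there is private. Instead, mini_any is a deliberately tiny stand-in written for this example, and any_cast_by_name is the hypothetical function name proposed above.

```cpp
#include <cassert>   // only needed for the assertions in the usage note
#include <cstring>
#include <typeinfo>
#include <vector>

// Tiny stand-in for boost::any, for illustration only (not copyable).
class mini_any
{
    struct placeholder
    {
        virtual ~placeholder() {}
        virtual const std::type_info& type() const = 0;
    };

    template <typename T>
    struct holder : placeholder
    {
        explicit holder(const T& value) : held(value) {}
        const std::type_info& type() const { return typeid(T); }
        T held;
    };

    placeholder* content;

    template <typename T>
    friend T* any_cast_by_name(mini_any* operand);

public:
    template <typename T>
    explicit mini_any(const T& value) : content(new holder<T>(value)) {}
    mini_any(const mini_any&) = delete;
    mini_any& operator=(const mini_any&) = delete;
    ~mini_any() { delete content; }
};

// Returns a pointer to the held value when the stored type's name equals the
// name of T, and a null pointer otherwise. Names are compared as strings, so
// duplicated type_info objects in different modules do not matter.
template <typename T>
T* any_cast_by_name(mini_any* operand)
{
    return operand
        && std::strcmp(operand->content->type().name(), typeid(T).name()) == 0
            ? &static_cast<mini_any::holder<T>*>(operand->content)->held
            : nullptr;
}
```

For a mini_any constructed from a std::vector<int>, any_cast_by_name<std::vector<int> >(&a) yields a valid pointer, while any_cast_by_name<int>(&a) yields a null pointer. Keeping this as a separate function leaves the regular cast on the fast path for users who never cross a shared boundary.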

comment:9 by anonymous, 13 years ago (in reply to: comment:6)

Replying to staffan.spam at gimaker se:

> I was bitten by this bug as well, and it really wasn't fun to track down.
>
> Using RTLD_GLOBAL isn't possible when using libltdl to load the DSO, since lt_dlopenext() doesn't expose a way to control the flags used with dlopen.

libltdl always uses RTLD_GLOBAL, assuming RTLD_GLOBAL is defined for your platform. See libltdl/ltdl.c in libtool.

comment:10 by nasonov, 13 years ago

Resolution: changed from None to fixed
Status: changed from new to closed

(In [56168]) Fix #754 (boost::any - typeid comparison across shared boundaries).
