Implement an optimization that David Abrahams and myself came up with,
where Boost.Function uses a bit in the vtable pointer to indicate when
the target function object has a trivial copy constructor, trivial
destructor, and fits within the small object buffer. In this case, we
just copy the bits of the function object rather than performing an
indirect call to the manager.
This results in a 60% speedup on a micro-benchmark that copies and
calls such function objects repeatedly.