Opened 9 years ago
#8831 new Feature Requests
Reuse capacity from user containers in order to prevent superfluous allocations
| Reported by: | | Owned by: | Marshall Clow |
|---|---|---|---|
| Milestone: | To Be Determined | Component: | string_algo |
| Version: | Boost 1.54.0 | Severity: | Optimization |
| Keywords: | | Cc: | |
Description
Inside boost::algorithm::split, a new variable of the container type is created, filled, and then swapped with the result.
See http://www.boost.org/doc/libs/1_54_0/boost/algorithm/string/iter_find.hpp, which contains code like:
SequenceSequenceT Tmp(itBegin, itEnd);
Result.swap(Tmp);
Maybe that was done in pursuit of the strong exception safety guarantee, but I don't see much value in it here, because split is supposed to replace the values in the original container (see https://svn.boost.org/trac/boost/ticket/5915). I think the basic guarantee would be enough.
Often SequenceSequenceT is a container like std::vector that already has capacity left over from previous uses. That capacity could be reused, avoiding costly allocations. For example:
Result.assign(itBegin, itEnd);
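To make the difference concrete, here is a minimal sketch that uses only the standard library. The function names fill_by_swap and fill_by_assign are hypothetical and are not part of Boost; they only mirror the two strategies discussed above.

```cpp
#include <string>
#include <vector>

// Strategy quoted from iter_find.hpp: build a temporary, then swap.
// Hypothetical helper name, for illustration only.
template <typename Seq, typename It>
void fill_by_swap(Seq& result, It first, It last) {
    Seq tmp(first, last);  // allocates fresh storage for a non-empty range
    result.swap(tmp);      // the old storage is released when tmp is destroyed
}

// Proposed strategy: assign in place, so an existing buffer can be reused
// when its capacity is already large enough.
// Hypothetical helper name, for illustration only.
template <typename Seq, typename It>
void fill_by_assign(Seq& result, It first, It last) {
    result.assign(first, last);
}
```

With a std::vector that has enough capacity, fill_by_assign is expected to leave the buffer in place, while fill_by_swap replaces it with a newly allocated one on every call. Note that fill_by_assign only offers the basic exception safety guarantee.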
That might require stricter requirements on SequenceSequenceT. Alternatively, an overload or a traits specialization could be used for common containers such as std::vector and boost::container::vector, or the behaviour could be exposed as a customization point.
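One possible shape for such a dispatch, as a sketch only: the trait has_range_assign and the function store_range are hypothetical names, not existing Boost facilities. The trait detects a range assign member, and containers without one (std::set here) fall back to the current construct-and-swap path.

```cpp
#include <set>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical trait: true when Seq provides assign(It, It).
template <typename Seq, typename It, typename = void>
struct has_range_assign : std::false_type {};

template <typename Seq, typename It>
struct has_range_assign<
    Seq, It,
    decltype(void(std::declval<Seq&>().assign(std::declval<It>(),
                                              std::declval<It>())))>
    : std::true_type {};

// Chosen when assign is available: may reuse existing capacity.
template <typename Seq, typename It>
void store_impl(Seq& result, It first, It last, std::true_type) {
    result.assign(first, last);
}

// Fallback: the construct-and-swap approach used today.
template <typename Seq, typename It>
void store_impl(Seq& result, It first, It last, std::false_type) {
    Seq tmp(first, last);
    result.swap(tmp);
}

// Hypothetical entry point that picks a strategy at compile time.
template <typename Seq, typename It>
void store_range(Seq& result, It first, It last) {
    store_impl(result, first, last, has_range_assign<Seq, It>());
}
```

This keeps the current requirements for arbitrary sequence types and only changes behaviour for containers that offer assign. A user-specializable trait would be another way to opt in or out.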
Here is a proof-of-concept that avoids the allocations and shows the speed difference: http://coliru.stacked-crooked.com/view?id=bf5dd9f2d9d20d61470e73a6b2940333-9a9914b3e2b7ed07c206d6accecccdb6
On my machine I get the following results:
start end 0.85 s 64000000
start end 1.55 s 64000000
I.e. the version with allocations is ~1.8x slower.
Other algorithms may have similar issues; I haven't checked.