Boost C++ Libraries: Ticket #6324: Flyweight performance improvement https://svn.boost.org/trac10/ticket/6324 <p> Hi, I am using flyweight for storing char_string, and when I do an erase one of the flyweight entry when there are a lot of entries, I am facing the performance issue, like ticket <a class="closed ticket" href="https://svn.boost.org/trac10/ticket/4264" title="#4264: Bugs: boost multi_index hashed_unique erase takes linear time (closed: fixed)">#4264</a> mentioned, in multi_index hashed_unique erase in boost/flyweight/hashed_factory.hpp hashed_factory_class::erase(handle_type) . </p> <p> There are two reasons. </p> <ol><li>hashed_factory_class::erase should call erase with key, like </li></ol><p> cont.erase(cont.key_extractor()(*(cont.iterator_to(*h)))); instead of calling cont.erase(cont.iterator_to(*h)); as Change History of <a class="closed ticket" href="https://svn.boost.org/trac10/ticket/4264" title="#4264: Bugs: boost multi_index hashed_unique erase takes linear time (closed: fixed)">#4264</a> mentions. </p> <ol start="2"><li>boost::multi_index hashed_unique erase takes linear time against the bucketsize, even if you use erase method with key, as ticket <a class="closed ticket" href="https://svn.boost.org/trac10/ticket/4264" title="#4264: Bugs: boost multi_index hashed_unique erase takes linear time (closed: fixed)">#4264</a>'s Change History mentioned. It calculates begin() and do linear search the top of the node when erasing some node. </li></ol><p> These two behavior cause significant performance penalty when a lot of entries are stored in flyweight. </p> <p> I think most of users of multi_index hashed_unique are expecting not O(1) begin(), but O(1) erase(key). I would like the erase method of hashed_unique should perform O(1). </p> <p> Thanks, </p> en-us Boost C++ Libraries /htdocs/site/boost.png https://svn.boost.org/trac10/ticket/6324 Trac 1.4.3 Joaquín M López Muñoz Fri, 13 Jan 2012 09:31:32 GMT <link>https://svn.boost.org/trac10/ticket/6324#comment:1 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/6324#comment:1</guid> <description> <p> The problem with erasing by key rather than by iterator is that the former can throw (when calculating the hash value of the key, for instance), which Boost.Flyweight can't afford. </p> <p> Have you tried using <a href="http://www.boost.org/libs/flyweight/doc/reference/factories.html#set_factory">set_factory</a>? Insertion is probably slower but deletion does not have the problems you describe. </p> </description> <category>Ticket</category> </item> <item> <author>k_onishi@…</author> <pubDate>Wed, 01 Feb 2012 06:23:31 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/6324#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/6324#comment:2</guid> <description> <p> Thanks for your comment. </p> <p> I understood the design policy of Boost.Flyweight. But I think you can avoid exception by catch clause in erase method of the factory. Once you catch the exception, then erasing by iterator. </p> <p> I had adapted set_factory and improve performance by using erase(key) in the erase of the factory before posting this ticket. To solve whole problem completely, I modified multi_index/hashed_index.hpp as well as using set_factory of flyweight. </p> <p> I think these problems always stem from iterator manipulation of multi_index (and unordered). Does anyone have any idea of work around in multi_index.hashed_index performance issues? </p> <p> My problem is, or I am worried about, the bad reputation of boost library. Some commercial projects are using boost::flyweight and boost::multi_index::hashed_unique expecting O(1) inserting, erasing, and finding entry, but we had faced on the performance problem of erasing both flyweight and multi_index at late test phase. </p> <p> Users cannot imagine erasing of hashed container is time-consuming. Documentation (tell us which method has performance issue) or work around or customization (like set_factory) may help my problems. </p> <p> Thanks, </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Joaquín M López Muñoz</dc:creator> <pubDate>Thu, 17 Oct 2013 09:22:11 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/6324#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/6324#comment:3</guid> <description> <p> Starting with Boost.1.56, hashed indices will use a new internal data structure allowing for O(1) erase(iterator). </p> <p> <a class="ext-link" href="https://svn.boost.org/trac/boost/changeset/86264"><span class="icon">​</span>https://svn.boost.org/trac/boost/changeset/86264</a> </p> <p> A preview can be obtained by just downloading from the current Boost trunk branch. Any feedback on the performance within real-life scenarios is greatly appreciated. Waiting to such feedback before closing the bug. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Joaquín M López Muñoz</dc:creator> <pubDate>Fri, 29 Nov 2013 19:59:50 GMT</pubDate> <title>status changed; resolution set https://svn.boost.org/trac10/ticket/6324#comment:4 https://svn.boost.org/trac10/ticket/6324#comment:4 <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">fixed</span> </li> </ul> Ticket