Opened 12 years ago

Closed 9 years ago

#4264 closed Bugs (fixed)

boost multi_index hashed_unique erase takes linear time

Reported by: anonymous
Owned by: Joaquín M López Muñoz
Milestone:
Component: multi_index
Version: Boost 1.41.0
Severity: Problem
Keywords:
Cc:

Description

Hi,

We are using a multi_index container with several hashed_unique indices. We make the number of buckets pretty large to avoid hash collisions.

When we do an erase, the time it takes is linear in the number of buckets. What happens is that erase returns an iterator to the next item, which means it has to search linearly through the buckets until it finds one with an item in it. In our tests this takes several milliseconds (we have 1,000,000 buckets). As the number of items in the multi_index decreases, the performance gets worse, because the search for a non-empty bucket has to go further.

Deleting the item in the first non-empty bucket also takes longer, because the hashed index keeps a variable recording which bucket is the first non-empty one. When that bucket becomes empty, another linear search is needed to find the next non-empty bucket.

I would suggest having erase return void and not keeping track of the first non-empty bucket, so that erase takes the expected constant time. Iterating is expected to take linear time, so if that is slow it is expected, but erase should be constant time.
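For illustration, a minimal sketch of the pattern (the code below is illustrative, not from our production system):

    #include <boost/multi_index_container.hpp>
    #include <boost/multi_index/hashed_index.hpp>
    #include <boost/multi_index/identity.hpp>
    #include <chrono>
    #include <iostream>

    namespace bmi = boost::multi_index;

    typedef bmi::multi_index_container<
        int,
        bmi::indexed_by<bmi::hashed_unique<bmi::identity<int> > >
    > int_set;

    int main()
    {
        int_set s;
        s.rehash(1000000); // ~1,000,000 buckets, almost all of them empty
        s.insert(42);

        int_set::iterator it = s.find(42);
        std::chrono::steady_clock::time_point t0 = std::chrono::steady_clock::now();
        s.erase(it); // scans empty buckets to construct the returned iterator
        std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
        std::cout << std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count()
                  << " us for one erase\n";
    }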

Change History (17)

comment:1 by Joaquín M López Muñoz, 12 years ago

Hi,

The first problem you mention has been debated for a number of years in the C++ committee (C++0x unordered associative containers have exactly the same problem), and that debate is probably not over yet:

http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#579

I still hope that the signature of erase will be changed so as to return void, and I will update Boost.MultiIndex accordingly when that happens. In the meantime, if your hashed index is unique (no duplicates), or you don't mind erasing all equivalent elements at once, you can replace your statements of the form

c.erase(it);

with

c.erase(c.key_extractor()(*it));

that is, you erase based on the key of the element pointed to by it; this overload of erase does not return an iterator and thus has constant complexity.
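A complete sketch of the workaround (the employee type and its id member are just illustrative names):

    #include <boost/multi_index_container.hpp>
    #include <boost/multi_index/hashed_index.hpp>
    #include <boost/multi_index/member.hpp>
    #include <string>

    namespace bmi = boost::multi_index;

    struct employee
    {
        int         id;
        std::string name;
    };

    typedef bmi::multi_index_container<
        employee,
        bmi::indexed_by<
            bmi::hashed_unique<bmi::member<employee, int, &employee::id> > >
    > employee_set;

    void fast_erase(employee_set& c, employee_set::iterator it)
    {
        // key_extractor()(*it) yields the key of the pointed-to element; the
        // erase(key) overload returns a count rather than an iterator, so no
        // scan for the next non-empty bucket takes place.
        c.erase(c.key_extractor()(*it));
    }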

As for the second problem, the tracking of the first non-empty bucket: I don't see an easy solution here, because begin is required to be constant. How big is the impact of this particular issue in your case (leaving aside the first problem)?

comment:2 by anonymous, 12 years ago

The problem with keeping track of the first non-empty bucket is just as big. We use this for an ultra-low-latency application and we can't use any function that takes linear time, even if it does so only once in a few calls. We can't afford the risk that our erase() happens to empty the first non-empty bucket and trigger a linear search, no matter how infrequently that may happen.

Would it be possible to have a function on the hash map that I can call explicitly to disable this keeping-track-of-the-first-non-empty-bucket feature, with the understanding that begin() will no longer have constant time complexity? What would be the ramifications of begin() having linear time complexity?

comment:3 by Joaquín M López Muñoz, 12 years ago

The problem with keeping track of the first non-empty bucket is just as big. We use this for an ultra-low-latency application and we can't use any function that takes linear time, even if it does so only once in a few calls. We can't afford the risk that our erase() happens to empty the first non-empty bucket and trigger a linear search, no matter how infrequently that may happen.

The possible linear search is amortized across different calls to erase, so on average the time taken is constant. Have you actually measured whether this makes an impact on your app?

Would it be possible to have a function on the hash map that I can call explicitly to disable this keeping-track-of-the-first-non-empty-bucket feature, with the understanding that begin() will no longer have constant time complexity?

There's no technical impediment to doing this, but I won't unless there's a compelling argument that the keeping-track-of-the-first-non-empty-bucket feature is exacting a significant performance penalty. Take into account that other implementations of C++0x unordered associative containers also have this feature, to ensure a constant-complexity begin.

What would be the ramifications of begin() having linear time complexity?

It would be a violation of the standard. begin is required to be constant (not amortized constant), hence we can't simply cache the first non-empty bucket and make it amortized constant; the keeping-track-of-the-first-non-empty-bucket feature has to be maintained at insertion and deletion time.

comment:4 by anonymous, 12 years ago

Hello

A possible solution to this bug is to keep an intrusive list of the nodes in the table, so that begin() and the iterator returned by erase(iterator) are O(1).
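Roughly, the node layouts involved would be (a hypothetical sketch, not the actual Boost.MultiIndex internals):

    // Current scheme: one pointer per node, singly linked within a bucket.
    template<typename T>
    struct bucket_node
    {
        bucket_node* next; // next node in the same bucket
        T            value;
    };

    // With an intrusive list threading all nodes, begin() and "iterator to
    // the next element" become O(1), at the cost of two extra pointers per
    // node.
    template<typename T>
    struct listed_node
    {
        listed_node* next;     // next node in the same bucket
        listed_node* seq_prev; // previous node in global element order
        listed_node* seq_next; // next node in global element order
        T            value;
    };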

BTW, it is not amortized. If the bucket table has been resized to 2 million and you constantly insert()/erase() a couple of elements, you consistently get the worst-case scenario.

Also, since the bucket vector never shrinks, the problem can also be reproduced by insert()ing 2 million objects and then erase()ing them. The erase()ing will take forever.

Thanks, Sasha

comment:5 by anonymous, 12 years ago

Hello again

I was reading gcc 4.1.2's unordered_map/unordered_set hashtable, and it does not have this performance bug: its begin() is linear.

I have measured this performance problem many times. erase()ing/replace()ing/modify()ing a single node can take a millisecond with a bucket table of 2 million. I would consider that a significant performance penalty, especially when one insert()s/erase()s only a few elements repeatedly in between insert()ing/erase()ing 1 million elements.

Thanks, Sasha

comment:6 by Joaquín M López Muñoz, 12 years ago

A possible solution to this bug is to keep an intrusive list of the nodes in the table, so that begin() and the iterator returned by erase(iterator) are O(1).

This would increase the size of nodes, which are currently kept to a minimum (one pointer per node).

BTW, it is not amortized. If the bucket table has been resized to 2 million and you constantly insert()/erase() a couple of elements, you consistently get the worst-case scenario.

Yes, there are cases where you don't get the benefit of amortization. Unordered containers give average-case complexity and worst-case complexity; I'd include the case you describe as a form of the latter. Remember these containers are tricky and don't guarantee performance.

Also, since the bucket vector never shrinks, the problem can also be reproduced by insert()ing 2 million objects and then erase()ing them. The erase()ing will take forever.

This effect is not related to the keeping-track-of-the-first-non-empty-bucket feature but to LWG issue #579, which is something different and which I'm willing to correct as soon as a resolution is agreed within the C++ committee.

I was reading gcc 4.1.2's unordered_map/unordered_set hashtable, and it does not have this performance bug: its begin() is linear.

Yes, you're right; I've taken a look myself, and begin in the latest versions of libstdc++'s unordered containers is linear. This is a bug in libstdc++, since the operation is required to be constant. I'll file a bug report to gather feedback from the libstdc++ developers.

comment:7 by Joaquín M López Muñoz, 12 years ago

Filed a bug report against libstdc++:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44480

comment:8 by anonymous, 12 years ago

Hello

I understand that you're waiting for clarity from the standard, so I'll drop all comparisons to gcc4.1.2's unordered_set/map and other implementations.

As feedback from a big fan and user of multi-index: I am willing to eat the increased node size of an intrusive list in exchange for guaranteed performance, and I am pretty sure all of your other fans and users feel the same.

Thanks, Sasha

comment:9 by Joaquín M López Muñoz, 12 years ago

I understand that you're waiting for clarity from the standard, so I'll drop all comparisons to gcc4.1.2's unordered_set/map and other implementations.

There are definitely some dark corners in the specification, and it'd be great if some consensus could be reached.

As feedback from a big fan and user of multi-index: I am willing to eat the increased node size of an intrusive list in exchange for guaranteed performance, and I am pretty sure all of your other fans and users feel the same.

Hashed indices have a minimal memory penalty, and this is something I'd like to preserve at all costs (and I think some users would, too). I'm actively tracking these problems and will act upon them as soon as C++0x provides a clarification.

comment:10 by anonymous, 12 years ago

Hello

How about using policy-based programming to get the best of both worlds?

One possibility is to use template arguments that override the current default implementation and make erase() O(1) and begin() O(bucket vector size).

Another possibility is to add a method, something like enable_fast_erase_slow_begin(), that overrides the current default implementation and makes erase() O(1) and begin() O(bucket vector size).

Obviously, once policy-based programming is used, it opens the door to several implementations that can satisfy all users of multi-index; a rough sketch of the first option follows.
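For instance (purely hypothetical names; nothing like this exists in Boost.MultiIndex today):

    #include <cstddef>
    #include <type_traits>

    struct constant_begin {}; // track the first non-empty bucket (current behavior)
    struct fast_erase     {}; // skip the tracking: O(1) erase, O(bucket count) begin

    template<typename Policy>
    class toy_hash_table
    {
    public:
        toy_hash_table() : first_nonempty_(0) {}

        void on_bucket_emptied(std::size_t bucket)
        {
            // Only the constant_begin policy pays for the bookkeeping that
            // searches for the next non-empty bucket.
            if (std::is_same<Policy, constant_begin>::value &&
                bucket == first_nonempty_)
            {
                scan_for_first_nonempty(); // the linear step under discussion
            }
        }

    private:
        void scan_for_first_nonempty() { /* ... */ }
        std::size_t first_nonempty_;
    };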

Thanks, Sasha

comment:11 by jzwinck@…, 12 years ago

Regarding the standard and erase returning void, see #3693 and #3966, which provide an interim solution (and links to more of the gory details).

comment:12 by Marshall Clow, 12 years ago

Can we move this bug off of "showstopper"?

comment:13 by jzwinck@…, 12 years ago

How about fixing it simply by adding a quick_erase(iterator) method as in #3966? Anything more complex than this seems overwrought to me, and there's clear precedent for moving forward with something, anything, that prevents disastrous runtime performance when erasing by iterator.
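For reference, usage in Boost.Unordered looks roughly like this (a sketch; quick_erase may be renamed or deprecated depending on your Boost version):

    #include <boost/unordered_map.hpp>
    #include <string>

    int main()
    {
        boost::unordered_map<int, std::string> m;
        m[1] = "one";

        boost::unordered_map<int, std::string>::iterator it = m.find(1);
        m.quick_erase(it); // returns void: no scan for the next occupied bucket
    }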

comment:14 by anonymous, 10 years ago

C++11 is now released; I don't think there is any more debate about the complexity guarantees of these routines or their return values. Can multi-index's hashed_index be updated to meet the standard's complexity guarantees? Specifically, the issue mentioned in this bug: a.erase(itr) is O(bucket_count), but it is specified to be O(1) average and O(a.size()) worst case.

multi-index is an extremely useful library that I would like to use in production code, but I can't use hashed_index while erase(itr) is O(bucket count). For a hash table with a low load factor, O(bucket count) is very bad indeed.

It's unfortunate that the standard forces a larger data structure on the implementation. Perhaps the smaller-memory hash table that does not meet the standard (say, with a non-constant begin() and an erase(itr) that returns void) could still be included in multi-index, for those who would prefer a smaller memory footprint.

comment:15 by viboes, 9 years ago (in reply to comment:12)

Severity: Showstopper → Problem

Replying to marshall:

Can we move this bug off of "showstopper"?

+1.

comment:16 by Joaquín M López Muñoz, 9 years ago

Starting with Boost 1.56, hashed indices will use a new internal data structure allowing for O(1) erase(iterator).

https://svn.boost.org/trac/boost/changeset/86264

A preview can be obtained by downloading the current Boost trunk branch. Any feedback on the performance within real-life scenarios is greatly appreciated; I'm waiting for such feedback before closing the bug.
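A quick way to exercise the change (a sketch; absolute numbers will vary by machine):

    #include <boost/multi_index_container.hpp>
    #include <boost/multi_index/hashed_index.hpp>
    #include <boost/multi_index/identity.hpp>
    #include <chrono>
    #include <cstddef>
    #include <iostream>

    namespace bmi = boost::multi_index;

    typedef bmi::multi_index_container<
        int,
        bmi::indexed_by<bmi::hashed_unique<bmi::identity<int> > >
    > int_set;

    int main()
    {
        // Erase a single element at widely different bucket counts; with the
        // new data structure the time should stay flat rather than grow
        // linearly with the number of buckets.
        for (std::size_t buckets = 1000; buckets <= 10000000; buckets *= 100) {
            int_set s;
            s.rehash(buckets);
            s.insert(1);
            std::chrono::steady_clock::time_point t0 = std::chrono::steady_clock::now();
            s.erase(s.find(1));
            std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
            std::cout << buckets << " buckets: "
                      << std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count()
                      << " ns\n";
        }
    }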

comment:17 by Joaquín M López Muñoz, 9 years ago

Resolution: fixed
Status: new → closed