Opened 6 years ago

Last modified 6 years ago

#12229 new Bugs

intrusive::unordered_set<T>::rehash() broken

Reported by: Christian Kaiser <boost@…> Owned by: Ion Gaztañaga
Milestone: To Be Determined Component: intrusive
Version: Boost 1.61.0 Severity: Problem
Keywords: Cc:

Description

I do use the intrusive::unordered_set<T> to store language-dependent objects, and when the language changes, I used to use

mymap::rehash()

to update the buckets. Notice the bucket count does not change!

As it is now (1.61), "fast_shrink" becomes true:

const bool fast_shrink = (!incremental) && (old_bucket_count >= new_bucket_count) &&

(power_2_buckets
(old_bucket_count % new_bucket_count) == 0);

while in the previouis version used, it was

const bool fast_shrink = (!incremental) && (old_bucket_count > new_bucket_count) &&

(power_2_buckets
(old_bucket_count % new_bucket_count) == 0);

and "fast_shrink" was false.

Due to

if(same_buffer && fast_shrink && (n < new_bucket_count)){

new_first_bucket_num = n; n = new_bucket_count;

}

the "n" is set to the bucket count, and the loop that is rehashing is NOT entered any more, thus rehash() does nothing any more.

This is according to the documentation, but a change in behaviour. So I add this as as a warning that the change might cause trouble for some people. A "rehash(bucket_size+1)" still does its job, though not as optimal as the size is then not a prime any more. An optional "force_rehash" parameter or such to the rehash() function might be handy.

Change History (2)

comment:1 by Ion Gaztañaga, 6 years ago

Previous behaviour was an unintended side effect. That behaviour was also broken if the hash was stored in the hook (as it would not be recalculated and stored). A new "full_rehash" function has been added in

https://github.com/boostorg/intrusive/commit/4546ffba1d8af0c97072456779a5f0d44067a43c

in order to achieve the desired "force rehash" behaviour. This function requires some invariants to be preserved (previous equal keys should produce equal hashes and previous uniqueness should be respected as otherwise rehashing will produce a broken container) so that only hashes are calculated.

Please check the proposed solution to see if it's useful when the language is changed and a new erasure+insertion would be more expensive than just rehashing.

comment:2 by boost@…, 6 years ago

Thank you very much for implementing/correcting. Sorry it took some time, but I can confirm that it works as intended. Great service!

Note: See TracTickets for help on using tickets.