Boost C++ Libraries: Ticket #7630: Range adaptors do not play nicely with range-based for loops https://svn.boost.org/trac10/ticket/7630 <p> Consider the following C++11 code, adapted from the Range documentation: </p> <pre class="wiki">std::vector&lt;int&gt; vec; for (int val : vec | boost::adaptors::reversed | boost::adaptors::uniqued) { // Do stuff with val } </pre><p> The behavior of this natural-seeming code is actually undefined, due to a dangling reference: per the C++ standard, it is equivalent to </p> <pre class="wiki">{ auto &amp;&amp; __range = (vec | boost::adaptors::reversed | boost::adaptors::uniqued); for ( auto __begin = __range.begin(), __end = __range.end(); __begin != __end; ++__begin ) { int val = *__begin; // Do stuff with val } } </pre><p> The problem is that the value returned by the subexpression <code>vec | boost::adaptors::reversed</code> is a temporary, so its lifetime ends at the end of the statement containing it, namely the declaration of <code>__range</code>. Thus, <code>__range</code> is left holding a dangling reference to the range it's adapting. </p> <p> The fix is for each range adaptor to use an rvalue-reference overload to detect whether the input range is a temporary, and if so, move it into the adaptor (i.e. with <code>std::move</code>) in order to extend its lifetime. </p> en-us Boost C++ Libraries /htdocs/site/boost.png https://svn.boost.org/trac10/ticket/7630 Trac 1.4.3 Jeffrey Yasskin <jyasskin@…> Fri, 02 Nov 2012 18:49:49 GMT cc set https://svn.boost.org/trac10/ticket/7630#comment:1 https://svn.boost.org/trac10/ticket/7630#comment:1 <ul> <li><strong>cc</strong> <span class="trac-author">jyasskin@…</span> added </li> </ul> Ticket roman.perepelitsa@… Mon, 05 Nov 2012 12:24:58 GMT <link>https://svn.boost.org/trac10/ticket/7630#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:2</guid> <description> <p> There is one problem with the proposed solution: return types for all range adaptors are documented. For example, both vec | reversed and and vector&lt;int&gt;() | reversed are specified to have type reversed_range&lt;vector&lt;int&gt;&gt;, although in the first case the adaptor should store just a reference to the vector and in the second case it should have the vector stored by value. This can be done by storing variant&lt;vector&lt;int&gt;*, vector&lt;int&gt;&gt; in reversed_range&lt;vector&lt;int&gt;&gt;, but it has space and performance overhead. </p> <p> If return types of range adaptors weren't specified, the problem would not exist because vec | reversed and and vector&lt;int&gt;() | reversed could have different types, each implemented with zero overhead. This is a cleaner design, because specific types of the adaptors aren't of much use and can always be deduced with decltype, auto or even erased. Unfortunately, this would be a breaking change. </p> </description> <category>Ticket</category> </item> <item> <author>roman.perepelitsa@…</author> <pubDate>Mon, 05 Nov 2012 12:26:19 GMT</pubDate> <title>cc changed; keywords set https://svn.boost.org/trac10/ticket/7630#comment:3 https://svn.boost.org/trac10/ticket/7630#comment:3 <ul> <li><strong>cc</strong> <span class="trac-author">roman.perepelitsa@…</span> added </li> <li><strong>keywords</strong> range added </li> </ul> Ticket Dean Michael Berris Tue, 30 Apr 2013 23:58:21 GMT cc changed https://svn.boost.org/trac10/ticket/7630#comment:4 https://svn.boost.org/trac10/ticket/7630#comment:4 <ul> <li><strong>cc</strong> <span class="trac-author">mikhailberis@…</span> added </li> </ul> Ticket gromer@… Wed, 01 May 2013 15:58:16 GMT <link>https://svn.boost.org/trac10/ticket/7630#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:5</guid> <description> <p> One point of clarification, since it confused even me when I reread this: <code>operator |</code> is left-associative, so the adaptor expression is equivalent to <code>((vec | boost::adaptors::reversed) | boost::adaptors::uniqued)</code>. The temporary returned by the whole expression gets bound to <code>__range</code>, and therefore has its lifetime extended to match <code>__range</code>, but the temporary returned by the <em>subexpression</em> <code>vec | boost::adaptors::reversed</code> is not bound to a reference in the scope of the loop, so its lifetime is not extended. </p> </description> <category>Ticket</category> </item> <item> <author>Jeffrey Yasskin <jyasskin@…></author> <pubDate>Mon, 20 May 2013 03:19:27 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/7630#comment:6 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:6</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/7630#comment:2" title="Comment 2">roman.perepelitsa@…</a>: </p> <blockquote class="citation"> <p> There is one problem with the proposed solution: return types for all range adaptors are documented. </p> </blockquote> <p> Even though the adaptor return types are documented, they're not accurate. For example, <a href="http://www.boost.org/doc/libs/1_53_0/libs/range/doc/html/range/reference/adaptors/reference/filtered.html">http://www.boost.org/doc/libs/1_53_0/libs/range/doc/html/range/reference/adaptors/reference/filtered.html</a> says it returns a "boost::filtered_range&lt;typeof(rng)&gt;" (what is "typeof"? :-P), but it actually returns filtered_range&lt;Pred, Rng&gt;. adjacent_filtered, filtered, copied, replaced_if, and transformed list the wrong return types. indexed, indirected, map_keys, map_values, replaced, reversed, sliced, type_erased, and uniqued list the right ones. strided and tokenized don't list a return type. </p> <p> The natural return type for the fixed adapters is to have, say, lvalue|reversed return a reversed_range&lt;Rng&amp;&gt; and rvalue|reversed return a reversed_range&lt;Rng&gt;, since that's what the natural function template deduction will produce. I could instead have lvalue|reversed return reversed_range&lt;Rng&gt; and rvalue|reversed return reversed_range&lt;Rng&amp;&amp;&gt; in order to maintain compatibility for the safe existing uses of these type names. </p> <p> I would prefer to use the natural types because then any utilities will be useful for non-Boost consumers. </p> <p> Thoughts? </p> </description> <category>Ticket</category> </item> <item> <author>Jeffrey Yasskin <jyasskin@…></author> <pubDate>Mon, 27 May 2013 17:40:25 GMT</pubDate> <title>attachment set https://svn.boost.org/trac10/ticket/7630 https://svn.boost.org/trac10/ticket/7630 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">boost-range.patch</span> </li> </ul> <p> Patch demonstrating the fix on boost::adaptors::filtered </p> Ticket Jeffrey Yasskin <jyasskin@…> Mon, 27 May 2013 19:48:24 GMT <link>https://svn.boost.org/trac10/ticket/7630#comment:7 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:7</guid> <description> <p> Note that the attached patch is only the fix for the <code>filtered</code> adaptor. I expect to make any changes you want there, and then extend it to the rest of the adaptors after the design is right. </p> </description> <category>Ticket</category> </item> <item> <author>pbazant@…</author> <pubDate>Tue, 11 Jun 2013 13:04:58 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/7630#comment:8 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:8</guid> <description> <p> Is the original problem really a problem? The following links suggests otherwise: </p> <p> <a class="ext-link" href="http://stackoverflow.com/questions/9657708/c11-the-range-based-for-statement-range-init-lifetime"><span class="icon">​</span>http://stackoverflow.com/questions/9657708/c11-the-range-based-for-statement-range-init-lifetime</a> </p> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Fri, 14 Jun 2013 15:58:58 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/7630#comment:9 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:9</guid> <description> <p> Yes, this really is a problem. See <a class="assigned ticket" href="https://svn.boost.org/trac10/ticket/7630#comment:5" title="#7630: Bugs: Range adaptors do not play nicely with range-based for loops (assigned)">comment 5</a> above: the issue isn't the lifetime of the value of <code>((vec | boost::adaptors::reversed) | boost::adaptors::uniqued)</code>, it's the lifetime of the <em>subexpression</em> <code>(vec | boost::adaptors::reversed)</code>, which is not bound to a local reference, and so does not have its lifetime extended. </p> </description> <category>Ticket</category> </item> <item> <author>Igor Lubashev <ilubashe@…></author> <pubDate>Thu, 30 Jan 2014 16:58:13 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/7630#comment:10 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:10</guid> <description> <p> See <a class="closed ticket" href="https://svn.boost.org/trac10/ticket/9578" title="#9578: Bugs: Adapters (map_keys, map_values) cause undefined behavior (segv, etc) ... (closed: duplicate)">#9578</a> for a related issue, related to BOOST_FOREACH and adaptors. </p> <p> I view <a class="ticket" href="https://svn.boost.org/trac10/ticket/7630#comment:2" title="Comment 2">comment:2</a> regarding the documented types as a very minor issue. Since the application of adaptors to r-values has previously resulted in undefined behavior, changing the data type returned for the application of adaptors to r-values should not break almost any properly working code that is relying on this documentation. </p> <p> The only kind of working code that could rely on this would be something like <code>foo(map&lt;int,int&gt;() | map_keys)</code>. In such cases, I would expect that <code>foo</code> is actually declared as a template that automatically captures the range type, so no harm done. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Nathan Ridge</dc:creator> <pubDate>Sat, 01 Feb 2014 06:09:26 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/7630#comment:11 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:11</guid> <description> <p> I reviewed this patch over email (shortly after it was posted) and made some (minor) suggestions. I'm posting them here now so that if someone would like to pick up the work on this bug, they are able to: </p> <ul><li>In the constructor of the primary template of extend_temporary, we know that the constructor argument will be an rvalue, right? So, would it make sense to use std::move instead of std::forward? </li></ul><p> </p> <ul><li>Given that the primary template of extend_temporary will never be used in C++98 mode (i.e., with no rvalue references), why even define it? </li></ul><p> </p> <ul><li>Regarding the change to range_iterator to have it accept a reference to a range as an input type: </li></ul><p> </p> <blockquote> <blockquote> <p> I think has_range_iterator should be updated correspondingly (notice how it uses range_mutable_iterator and range_const_iterator directly right now). </p> </blockquote> </blockquote> <p> </p> <blockquote> <blockquote> <p> While most range metafunctions go through range_iterator, and will therefore also start working on references to ranges, there is a partial specialization of range_size in boost/range/size_type.hpp that will break if its argument is a reference to a range. I think that should be updated too for consistency. </p> </blockquote> </blockquote> <p> </p> <blockquote> <blockquote> <p> Since these changes can stand on their own, I think they make sense as an independent patch. In fact, such a patch would resolve <a class="ext-link" href="https://svn.boost.org/trac/boost/ticket/7885"><span class="icon">​</span>https://svn.boost.org/trac/boost/ticket/7885</a> . </p> </blockquote> </blockquote> <p> </p> <ul><li>Regarding the choice of range adaptor return type: I understand that </li></ul><p> </p> <blockquote> <blockquote> <p> adaptor&lt;Range&amp;&gt; for lvalue ranges and </p> </blockquote> </blockquote> <blockquote> <blockquote> <p> adaptor&lt;Range&gt; for rvalue ranges </p> </blockquote> </blockquote> <p> </p> <blockquote> <blockquote> <p> is what falls naturally out of template argument deduction. However, would something that makes the lvalue/rvalueness of the wrapped ranges more explicit be more readable? For example, </p> </blockquote> </blockquote> <p> </p> <blockquote> <blockquote> <blockquote> <p> adaptor&lt;Range&amp;&gt; for lvalue ranges and </p> </blockquote> </blockquote> </blockquote> <blockquote> <blockquote> <blockquote> <p> adaptor&lt;Range&amp;&amp;&gt; for rvalue ranges </p> </blockquote> </blockquote> </blockquote> <p> </p> <blockquote> <blockquote> <p> or </p> </blockquote> </blockquote> <p> </p> <blockquote> <blockquote> <blockquote> <p> adaptor&lt;Range, lvalue_tag&gt; for lvalue ranges and </p> </blockquote> </blockquote> </blockquote> <blockquote> <blockquote> <blockquote> <p> adaptor&lt;Range, rvalue_tag&gt; for rvalue ranges? </p> </blockquote> </blockquote> </blockquote> <p> </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Neil Groves</dc:creator> <pubDate>Wed, 26 Feb 2014 22:12:29 GMT</pubDate> <title>status changed https://svn.boost.org/trac10/ticket/7630#comment:12 https://svn.boost.org/trac10/ticket/7630#comment:12 <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">assigned</span> </li> </ul> Ticket Neil Groves Sun, 09 Mar 2014 01:06:56 GMT <link>https://svn.boost.org/trac10/ticket/7630#comment:13 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:13</guid> <description> <p> I don't think the provided example does suffer undefined behaviour. It doesn't matter that the subexpression lifetime is short since the subexpression was used as input to the outer expression. The outer expression holds iterators by value by copying them. It doesn't hold onto the source range as stated in the report. </p> <p> If one were to have a slightly different example with, for example, a vector returned by value from a function and then directly passed into the adaptors this would be undefined. I'm not convinced that this use case is worth the general penalty. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Nathan Ridge</dc:creator> <pubDate>Sun, 09 Mar 2014 02:52:24 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/7630#comment:14 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:14</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/7630#comment:13" title="Comment 13">neilgroves</a>: </p> <blockquote class="citation"> <p> I don't think the provided example does suffer undefined behaviour. It doesn't matter that the subexpression lifetime is short since the subexpression was used as input to the outer expression. The outer expression holds iterators by value by copying them. It doesn't hold onto the source range as stated in the report. </p> <p> If one were to have a slightly different example with, for example, a vector returned by value from a function and then directly passed into the adaptors this would be undefined. </p> </blockquote> <p> Correct. </p> <blockquote class="citation"> <p> I'm not convinced that this use case is worth the general penalty. </p> </blockquote> <p> I have run into this situation, and been bitten by the undefined behaviour, on a number of occasions. Move constructors are generally cheap, so I believe the general penalty for fixing this would be low. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Neil Groves</dc:creator> <pubDate>Sun, 09 Mar 2014 10:05:12 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/7630#comment:15 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:15</guid> <description> <p> Okay let me think about this some more. It is an interesting and challenging problem. For many use cases I have the move penalty is completely unacceptable due to cache effects. If I can be sure I can create a zero overhead solution I'll do it. Obviously if anyone else beats me to an answer I'll happily accept it. My interpretation of the current attached example is that it incurs a space and speed overhead even when not using rvalue references. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Neil Groves</dc:creator> <pubDate>Thu, 13 Mar 2014 15:32:44 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/7630#comment:16 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:16</guid> <description> <p> I think I have a solution. I'm starting prototypes and will attempt to apply this to some large codebases to ensure that backward compatibility is maximised. </p> <p> I believe we can alter the return types to have an additional template paramter or N template parameters (to be decided) that describe some additional range category decoration. These will be used to determine if the range should be held / moved. By using a default template paramter(s) for these new parameters we can maintain old behaviour for old code while allowing new code to work in the manner requested. I believe that this fits nicely with other new features that I intend to release. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Dean Michael Berris</dc:creator> <pubDate>Mon, 01 Sep 2014 15:50:29 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/7630#comment:17 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:17</guid> <description> <p> Is the fix ready? Can we test it somewhere? Or at least see what the test is so we can confirm whether it's actually fixed? </p> </description> <category>Ticket</category> </item> <item> <author>rob.desbois@…</author> <pubDate>Thu, 02 Jun 2016 11:33:48 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/7630#comment:18 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:18</guid> <description> <p> What's the current status of this? I have scoured Boost Trac and mailing lists recently and still feel uncertain whether the described pattern is generally safe. My typical use case is slightly different, though suffers from the same potentially-undefined behaviour: functions which return a range produced by piping a container through some adaptor chain. </p> <p> Looking through the code for some of these adaptors my best understanding is that it is safe based on Neil's <a class="ticket" href="https://svn.boost.org/trac10/ticket/7630#comment:13" title="Comment 13">comment:13</a> regarding copying of underlying iterators. I don't feel comfortable with that though: what if the implementation changes?. </p> <p> Perhaps while there is no fix (if there still isn't) could a summary of the issue and some advice be given in the Boost.Range docs? Even if it simply says "Here's the possible issue, but the implementation all of Boost's range adaptors means this is never UB" that would suffice: this is too concerning an issue regarding a common usage pattern to have no official recommendation for IMHO. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Nathan Ridge</dc:creator> <pubDate>Thu, 02 Jun 2016 23:18:04 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/7630#comment:19 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/7630#comment:19</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/7630#comment:18" title="Comment 18">rob.desbois@…</a>: </p> <blockquote class="citation"> <p> What's the current status of this? </p> </blockquote> <p> My impression is that the C++ community's thinking about Ranges in C++ has evolved a lot in the past couple of years. I think the state-of-the-art Ranges library is no longer Boost.Range, but Eric Niebler's <a class="ext-link" href="https://github.com/ericniebler/range-v3"><span class="icon">​</span>Range-v3</a>, which is now standards-track (forming the basis of the <a class="ext-link" href="http://open-std.org/JTC1/SC22/WG21/docs/papers/2016/n4569.pdf"><span class="icon">​</span>C++ Ranges TS</a>). </p> <p> Ranges-v3 does things a bit differently from Boost.Range, and solves this issue (among others). </p> <p> I'm not sure what the plan is for Boost.Range going forward. We could possibly borrow some ideas from Range-v3, such as its approach for solving this problem, and put them into Boost.Range. Or Range-v3 itself could be brought into Boost (not sure what Eric's plans are in that regard). </p> </description> <category>Ticket</category> </item> </channel> </rss>