Boost C++ Libraries: Ticket #6830: make_shared slower than shared_ptr(new) on VC++9 and 10 https://svn.boost.org/trac10/ticket/6830
<p> I created a simple benchmark for measuring raw allocation throughput for 3 classes of different sizes with a common base class (constructors and destructors trivial). The number of allocations was set to 40,000,000, as that gave me roughly 10 seconds of running time per test. </p>
<p> It turns out that on VC++9 (release target with default optimizations) boost::make_shared is significantly slower than simply doing boost::shared_ptr(new). Here's the benchmark output: </p>
<p> <a class="missing wiki">TestBoostMakeShared</a> 10.577s 3.78179e+006 allocs/s </p>
<p> <a class="missing wiki">TestBoostSharedPtrNew</a> 8.907s 4.49085e+006 allocs/s </p>
<p> As you can see, boost::make_shared is over 15% slower (in allocations per second) than the boost::shared_ptr(new) idiom. </p>
<p> One suggested solution: </p>
<p> boost::shared_ptr doesn't have a way to retrieve the deleter without using RTTI, which is what is slowing down the execution on VC++9/10. I decided to add one and use it from an alternative boost::make_shared. 
So I did the following: </p>
<ol><li>I added a virtual function to detail::sp_counted_base (detail\sp_counted_base_w32.hpp): </li></ol>
<blockquote> <p> virtual void * get_raw_deleter( ) = 0; </p> </blockquote>
<ol start="2"><li>I implemented the get_raw_deleter() function in sp_counted_impl_p (detail\sp_counted_impl.hpp): </li></ol>
<blockquote> <p> virtual void * get_raw_deleter( ) { </p> <blockquote> <p> return 0; </p> </blockquote> <p> } </p> </blockquote>
<ol start="3"><li>I implemented the get_raw_deleter() function in sp_counted_impl_pd (detail\sp_counted_impl.hpp): </li></ol>
<blockquote> <p> virtual void * get_raw_deleter( ) { </p> <blockquote> <p> return &amp;reinterpret_cast&lt;char&amp;&gt;( del ); </p> </blockquote> <p> } </p> </blockquote>
<ol start="4"><li>I implemented the get_raw_deleter() function in sp_counted_impl_pda (detail\sp_counted_impl.hpp): </li></ol>
<blockquote> <p> virtual void * get_raw_deleter( ) { </p> <blockquote> <p> return &amp;reinterpret_cast&lt;char&amp;&gt;( d_ ); </p> </blockquote> <p> } </p> </blockquote>
<ol start="5"><li>I added the following function to detail::shared_count: </li></ol>
<blockquote> <p> void * get_raw_deleter( ) const { </p> <blockquote> <p> return pi_? pi_-&gt;get_raw_deleter( ): 0; </p> </blockquote> <p> } </p> </blockquote>
<ol start="6"><li>I added the following function to shared_ptr&lt;&gt;: </li></ol>
<blockquote> <p> void * _internal_get_raw_deleter( ) const { </p> <blockquote> <p> return pn.get_raw_deleter( ); </p> </blockquote> <p> } </p> </blockquote>
<ol start="7"><li>I made a separate copy of the boost::make_shared function and changed a single line from: </li></ol>
<blockquote> <p> boost::detail::sp_ms_deleter&lt; T &gt; * pd = boost::get_deleter&lt; boost::detail::sp_ms_deleter&lt; T &gt; &gt;( pt ); </p> </blockquote>
<p> to: </p>
<blockquote> <p> boost::detail::sp_ms_deleter&lt; T &gt; * pd = static_cast&lt;boost::detail::sp_ms_deleter&lt; T &gt; *&gt;(pt._internal_get_raw_deleter()); </p> </blockquote>
<p> Benchmarking afterwards gave me the following results on VC++9: </p>
<p> <a class="missing wiki">TestBoostSharedPtrNew</a> 9.204s 4.34594e+006 allocs/s </p>
<p> <a class="missing wiki">TestBoostMakeShared</a> 10.499s 3.80989e+006 allocs/s </p>
<p> <a class="missing wiki">TestBoostMakeSharedAlt</a> 7.831s 5.1079e+006 allocs/s </p>
<p> These changes translated into an almost 35% improvement in allocation speed over the current implementation of boost::make_shared. Or, to put it differently, they amount to a 25+% decrease in running time, as we could have supposed from the profiling results. </p>
Dave Abrahams Thu, 26 Apr 2012 05:38:57 GMT <link>https://svn.boost.org/trac10/ticket/6830#comment:1 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/6830#comment:1</guid> <description> <p> I don't understand; <code>new</code> is a keyword, and <code>shared_ptr</code> is a class template, so <code>boost::shared_ptr(new)</code> is not legal C++. Could you please attach your benchmarks so we can reproduce your results? 
</p> </description> <category>Ticket</category> </item>
<item> <author>ierceg@…</author> <pubDate>Thu, 26 Apr 2012 11:30:20 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/6830#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/6830#comment:2</guid> <description> <p> I meant "boost::shared_ptr(new)" as an idiom and not as literal code, i.e. boost::shared_ptr(new int) vs. boost::make_shared&lt;int&gt;(). </p> </description> <category>Ticket</category> </item>
<item> <author>ierceg@…</author> <pubDate>Thu, 26 Apr 2012 16:10:54 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/6830#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/6830#comment:3</guid> <description> <p> This is a duplicate of <a class="new ticket" href="https://svn.boost.org/trac10/ticket/6829" title="#6829: Bugs: [smart_ptr] make_shared is slower than shared_ptr(new) in MSVC (new)">#6829</a>, so I suggest we close it and move the discussion there, especially since the solution proposed there is significantly more efficient. </p> </description> <category>Ticket</category> </item>
<item> <dc:creator>Peter Dimov</dc:creator> <pubDate>Tue, 11 Dec 2012 18:21:33 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/6830#comment:4 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/6830#comment:4</guid> <description> <p> (In <a class="changeset" href="https://svn.boost.org/trac10/changeset/81860" title="Change make_shared to use the new _internal_get_untyped_deleter. Refs ...">[81860]</a>) Change make_shared to use the new _internal_get_untyped_deleter. Refs <a class="closed ticket" href="https://svn.boost.org/trac10/ticket/6830" title="#6830: Patches: make_shared slower than shared_ptr(new) on VC++9 and 10 (closed: fixed)">#6830</a>. 
</p> </description> <category>Ticket</category> </item>
<item> <dc:creator>Peter Dimov</dc:creator> <pubDate>Tue, 11 Dec 2012 18:26:42 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/6830#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/6830#comment:5</guid> <description> <p> I was unable to reproduce the timing results, by the way. make_shared was always faster than shared_ptr(new), on both VC++8.0 and 10.0. I haven't tested with 9.0. </p> </description> <category>Ticket</category> </item>
<item> <dc:creator>Peter Dimov</dc:creator> <pubDate>Thu, 13 Dec 2012 14:57:17 GMT</pubDate> <title>status changed; resolution set</title> <link>https://svn.boost.org/trac10/ticket/6830#comment:6 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/6830#comment:6</guid> <description> <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">fixed</span> </li> </ul> <p> (In <a class="changeset" href="https://svn.boost.org/trac10/changeset/81899" title="Merged revision(s) 81860-81861 from trunk: Change make_shared to use ...">[81899]</a>) Merged revision(s) 81860-81861 from trunk: Change make_shared to use the new _internal_get_untyped_deleter. Fixes <a class="closed ticket" href="https://svn.boost.org/trac10/ticket/6830" title="#6830: Patches: make_shared slower than shared_ptr(new) on VC++9 and 10 (closed: fixed)">#6830</a>. ........ Add allocate_shared_noinit. ........ </p> </description> <category>Ticket</category> </item>
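<p> For readers who want to see the idea behind the fix in isolation, the following is a minimal sketch of the difference between a typed deleter lookup (which compares type_info objects and therefore needs RTTI) and an untyped one (which just returns the stored deleter's address). It is not Boost's code: counted_base, counted_impl_pd, toy_deleter, typed_lookup, untyped_lookup and lookups_agree are hypothetical names chosen for illustration, only loosely modelled on the classes mentioned in the ticket, and the sketch uses C++11 syntax for brevity. </p>

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical stand-in for a shared_ptr control block base class.
// This is an illustration only, not Boost's detail::sp_counted_base.
struct counted_base {
    virtual ~counted_base() {}
    // Typed lookup: compares type_info objects, which requires RTTI.
    virtual void* get_deleter(const std::type_info& ti) = 0;
    // Untyped lookup: returns the deleter's address with no type check.
    virtual void* get_untyped_deleter() = 0;
};

// Hypothetical control block that stores a deleter of type D.
template <class D>
struct counted_impl_pd : counted_base {
    D del;
    explicit counted_impl_pd(const D& d) : del(d) {}
    void* get_deleter(const std::type_info& ti) override {
        return ti == typeid(D) ? static_cast<void*>(&del) : nullptr;
    }
    void* get_untyped_deleter() override { return &del; }
};

// Hypothetical deleter, loosely playing the role of sp_ms_deleter<T>.
struct toy_deleter {
    bool initialized = false;
};

// Safe for any D: yields a null pointer when the stored type differs.
template <class D>
D* typed_lookup(counted_base& cb) {
    return static_cast<D*>(cb.get_deleter(typeid(D)));
}

// Only valid when the caller already knows the stored deleter is a D,
// as a make_shared-style factory does for the block it just created.
template <class D>
D* untyped_lookup(counted_base& cb) {
    return static_cast<D*>(cb.get_untyped_deleter());
}

// Checks that both lookups return the same address for the right type
// and that the typed lookup rejects a mismatched type.
bool lookups_agree() {
    counted_impl_pd<toy_deleter> block{toy_deleter{}};
    toy_deleter* a = typed_lookup<toy_deleter>(block);
    toy_deleter* b = untyped_lookup<toy_deleter>(block);
    assert(a != nullptr);
    return a == b && typed_lookup<int>(block) == nullptr;
}
```

<p> The untyped path trades the run-time type check for a plain virtual call, which is only safe because the factory that created the control block knows the deleter type statically. Whether this accounts for the timings reported above depends on the compiler and settings; as noted in the comments, the slowdown was not reproduced on VC++8.0 or 10.0. </p>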