Boost C++ Libraries: Ticket #1670: wave almost unusably slow on real-world input https://svn.boost.org/trac10/ticket/1670 <p> I'm running a wave-based preprocessor on boost/python.hpp as part of document generation (Synopsis). The preprocessing alone takes &gt;6 hours ! (The subsequent parsing with Synopsis' own C++ parser is then relatively quick, surprisingly.) </p> <p> I notice that wave seems to get increasingly slow, so I wonder whether the speed decrease is caused by some internal lookup on a growing dictionary (map) that has super-linear lookup time ? </p> Hartmut Kaiser Tue, 04 Mar 2008 17:53:06 GMT <link>https://svn.boost.org/trac10/ticket/1670#comment:1 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:1</guid> <description> <p> This apparently is not a Wave problem. I tried to preprocess python.hpp using the Wave command-line tool and it needed about 35 seconds to fully preprocess this file (full optimization) and about 4 minutes with optimizations off. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Hartmut Kaiser</dc:creator> <pubDate>Tue, 04 Mar 2008 17:53:39 GMT</pubDate> <title>status changed; resolution set https://svn.boost.org/trac10/ticket/1670#comment:2 https://svn.boost.org/trac10/ticket/1670#comment:2 <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">worksforme</span> </li> </ul> Ticket Stefan Seefeld Tue, 04 Mar 2008 17:57:26 GMT <link>https://svn.boost.org/trac10/ticket/1670#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:3</guid> <description> <p> Thanks for the quick reply. I'll profile with my own trace policy to see where it spends all this time. 
</p> </description> <category>Ticket</category> </item> <item> <dc:creator>Stefan Seefeld</dc:creator> <pubDate>Wed, 05 Mar 2008 04:31:19 GMT</pubDate> <title>status changed; resolution deleted https://svn.boost.org/trac10/ticket/1670#comment:4 https://svn.boost.org/trac10/ticket/1670#comment:4 <ul> <li><strong>status</strong> <span class="trac-field-old">closed</span> → <span class="trac-field-new">reopened</span> </li> <li><strong>resolution</strong> <span class="trac-field-deleted">worksforme</span> </li> </ul> <p> What were the flags you preprocessed this file with ? Which compiler were you emulating ? I noticed it makes quite a difference what macros are predefined, presumably because certain chunks of (boost) code are only enabled with the appropriate macro definitions. It seems entirely possible that large chunks of boost code are simply skipped as the presumed compiler represented by wave was not detected as 'supported'. </p> <p> Please find attached two python scripts that help you run wave in 'compiler emulation' mode. (Please make some obvious adjustments to the code to make it work for you.) </p> <p> I run it as 'python wave.py -S /usr/include/python2.5 -S boost boost/boost/python.hpp'. </p> <p> Calling the wave applet without any compiler emulation flags indeed takes only a couple of seconds. However, using flags to emulate GCC (on my Fedora 8 laptop) makes it run much longer (in fact, it is still running, so I can't tell you the total time.) (I notably see wave spend quite some time in boost's funny preprocessor loop construct, which is used in different places.) </p> <p> I'd be happy to contribute this code (with some suitable changes) to the boost.wave project. I'm sure others may find it useful, too. </p> <p> Here is how it works: </p> <p> The wave.py file calls Emulator.get_compiler_info('C++', 'c++') to find compiler-specific (system) search paths as well as predefined macros. 
(The Emulator module will use some heuristics to query those flags for compilers it knows, such as GCC and cl. Those data then get cached under ~/.synopsis...) </p> Ticket Stefan Seefeld Wed, 05 Mar 2008 04:33:12 GMT attachment set https://svn.boost.org/trac10/ticket/1670 https://svn.boost.org/trac10/ticket/1670 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">wave.py</span> </li> </ul> Ticket Stefan Seefeld Wed, 05 Mar 2008 04:33:45 GMT attachment set https://svn.boost.org/trac10/ticket/1670 https://svn.boost.org/trac10/ticket/1670 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">Emulator.py</span> </li> </ul> Ticket Ainsley Pereira <boostbug@…> Wed, 04 Jun 2008 17:45:08 GMT <link>https://svn.boost.org/trac10/ticket/1670#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:5</guid> <description> <p> I did some profiling on the wave preprocessor in tools, and discovered that over 71% of its time was spent getting thread-specific storage for code in spirit/phoenix/closures.hpp. (Another 4% was spent setting tss data.) It seems to be possible to define BOOST_WAVE_SUPPORT_THREADING to 0, but my bjam skills failed me and I was unable to build/link it with that to retest. I hope the information is useful to someone who knows more about the spirit closures stuff. 
</p> </description> <category>Ticket</category> </item> <item> <author>jordi@…</author> <pubDate>Thu, 20 Nov 2008 08:52:18 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/1670#comment:6 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:6</guid> <description> <blockquote> <p> Maybe our experience can add some useful information to this bug, and it may point to a critical issue in another Boost library: </p> </blockquote> <blockquote> <p> We met a similar problem: in our program we process around 1000 files with Wave and custom callbacks, and it was taking much longer than expected. We profiled and found that it spent 95% of its time traversing the thread-specific pointer list, in find_tss_data. </p> </blockquote> <blockquote> <p> A more restricted test, running 6 batches that each process the same file 30 times, surprisingly took longer on each successive batch. We checked that we were not leaking any resources and that we were not doing anything strange. We disabled threading support in Wave and that fixed the problem: each batch now takes the same time, as it should. The real-world case went from 40 minutes to 4 in some cases. </p> </blockquote> <blockquote> <p> My conclusion is that something is not well managed with the thread-specific pointers, either in Wave, in Spirit, or in Boost.Thread itself. This could potentially be a critical bug if that is the case, and should be narrowed down to the responsible library. </p> </blockquote> <blockquote> <p> By the way, to disable threading I added this at the top of wave_config.hpp, which is probably not a good idea: </p> </blockquote> <p> #define BOOST_WAVE_SUPPORT_THREADING 0 </p> <blockquote> <p> Platform is WinXP. Verified with both 1.36.0 and 1.37.0 versions. </p> </blockquote> <blockquote> <p> I hope this helps and thanks again for this great library. 
</p> </blockquote> <p> jordi </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Hartmut Kaiser</dc:creator> <pubDate>Thu, 20 Nov 2008 13:29:08 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/1670#comment:7 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:7</guid> <description> <p> Thanks for this information. The suspicion this has something to do with the threading support (in Phoenix and Spirit) is not new. The best way to handle this (for now) is apparently to disable threading support for Wave (as long as you don't really need it, and most users won't) the way you did: #define BOOST_WAVE_SUPPORT_THREADING 0. </p> <p> Regards Hartmut </p> <p> </p> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Thu, 20 Nov 2008 13:36:22 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/1670#comment:8 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:8</guid> <description> <p> Is this a flag that needs to be set prior to compiling boost, or is it enough to issue it in user code (prior to including boost.wave headers) ? </p> <p> Thanks, </p> <blockquote> <p> Stefan </p> </blockquote> </description> <category>Ticket</category> </item> <item> <dc:creator>Hartmut Kaiser</dc:creator> <pubDate>Thu, 20 Nov 2008 14:26:03 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/1670#comment:9 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:9</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/1670#comment:8" title="Comment 8">anonymous</a>: </p> <blockquote class="citation"> <p> Is this a flag that needs to be set prior to compiling boost, or is it enough to issue it in user code (prior to including boost.wave headers) ? </p> </blockquote> <p> Actually, both. 
You need to specify consistent settings while compiling the Wave libraries and the application using Wave. Either change wave_config.hpp directly or specify the settings on the command line/bjam. </p> <p> Regards Hartmut </p> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Thu, 20 Nov 2008 14:32:51 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/1670#comment:10 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:10</guid> <description> <p> Ah, that's too bad. While I can certainly do that for experimental purposes, I don't expect my (Synopsis') users to do it; they are simply expected to have boost installed (most likely as a system package, such as from Fedora or Debian). </p> <p> I don't know the boost.wave design well enough to be able to suggest anything at this point, but I certainly hope that there are (will be) ways for users only to 'pay what they need'. </p> <p> Thanks, </p> <blockquote> <p> Stefan </p> </blockquote> </description> <category>Ticket</category> </item> <item> <dc:creator>Hartmut Kaiser</dc:creator> <pubDate>Thu, 20 Nov 2008 17:31:41 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/1670#comment:11 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:11</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/1670#comment:10" title="Comment 10">anonymous</a>: </p> <blockquote class="citation"> <p> Ah, that's too bad. While I can certainly do that for experimental purposes, I don't expect my (Synopsis') users to do it; they are simply expected to have boost installed (most likely as a system package, such as from Fedora or Debian). 
</p> </blockquote> <blockquote class="citation"> <p> I don't know the boost.wave design well enough to be able to suggest anything at this point, but I certainly hope that there are (will be) ways for users only to 'pay what they need'. </p> </blockquote> <p> Actually, by default everything depends on your project settings. If you enable threading for your application, Wave picks that up too and will be built with threading enabled. If the application settings don't include threading, you'll get the Wave library with threading disabled. Really, I see no way to make that more flexible by default, do you? </p> <p> Regards Hartmut </p> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Thu, 20 Nov 2008 17:40:55 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/1670#comment:12 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:12</guid> <description> <p> I'm not sure what you mean by 'project settings' (I can guess, though ;-) ). </p> <p> When I'm distinguishing between library and application compile-time I mean this: When setting the above macro to a specific value, do I need to recompile boost.wave itself, or is it enough to compile my own code ? Are the libboost_wave libraries dependent on this setting ? (You seemed to suggest the answer to this question is 'yes', while I'm hoping for a 'no'.) Or maybe libboost_wave_mt uses TLS, while libboost_wave does not ? 
</p> <p> Thanks, </p> <blockquote> <p> Stefan </p> </blockquote> </description> <category>Ticket</category> </item> <item> <dc:creator>Hartmut Kaiser</dc:creator> <pubDate>Thu, 20 Nov 2008 18:24:29 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/1670#comment:13 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:13</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/1670#comment:12" title="Comment 12">anonymous</a>: </p> <blockquote class="citation"> <p> I'm not sure what you mean by 'project settings' (I can guess, though ;-) ). </p> </blockquote> <p> I meant the set of command line parameters used for compiling your application. </p> <blockquote class="citation"> <p> When I'm distinguishing between library and application compile-time I mean this: When setting the above macro to a specific value, do I need to recompile boost.wave itself, or is it enough to compile my own code ? Are the libboost_wave libraries dependent on this setting ? (You seemed to suggest the answer to this question is 'yes', while </p> </blockquote> <p> The settings used for the application should match the settings used for compiling the libraries. </p> <blockquote class="citation"> <p> I'm hoping for a 'no'.) Or maybe libboost_wave_mt uses TLS, while libboost_wave does not ? </p> </blockquote> <p> Yes that describes it best. </p> <p> Regards Hartmut </p> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Thu, 20 Nov 2008 18:57:46 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/1670#comment:14 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:14</guid> <description> <p> Does boost provide a facility to query those settings ? Are the relevant settings documented somewhere ? </p> <p> Also, not all macros affect the compiled library itself, IIUC, so I believe it's important to make that clear. 
I'm for example thinking of BOOST_WAVE_USE_DEPRECIATED_PREPROCESSING_HOOKS. </p> <p> Thanks, </p> <blockquote> <p> Stefan </p> </blockquote> </description> <category>Ticket</category> </item> <item> <dc:creator>Hartmut Kaiser</dc:creator> <pubDate>Thu, 20 Nov 2008 21:25:01 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/1670#comment:15 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/1670#comment:15</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/1670#comment:14" title="Comment 14">anonymous</a>: </p> <blockquote class="citation"> <p> Does boost provide a facility to query those settings ? Are the relevant settings documented somewhere ? </p> </blockquote> <p> Which settings? Whether threading is enabled or not? That's BOOST_HAS_THREADS, IIRC. </p> <blockquote class="citation"> <p> Also, not all macros affect the compiled library itself, IIUC, so I believe it's important to make that clear. I'm for example thinking of BOOST_WAVE_USE_DEPRECIATED_PREPROCESSING_HOOKS. </p> </blockquote> <p> Ok, will try to come up with a list of settings required to match. </p> <p> On second thought: you can always use the non-threaded Wave library as long as your application defines BOOST_WAVE_SUPPORT_THREADING=0. But you'll have to make sure not to use Wave from different threads at the same time. </p> <p> Regards Hartmut </p> </description> <category>Ticket</category> </item> </channel> </rss>
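For anyone trying to picture the symptom from comment 6 (each successive batch slower, with most of the time in find_tss_data), here is a toy model in plain C++. It is only a sketch under stated assumptions and not the actual Boost.Thread, Spirit or Wave code: ToyTssList, run_batch and batch_costs are hypothetical names, and it simply assumes that every processed file leaves one extra entry behind in a per-thread list that is searched linearly. Instead of measuring time, it counts how many list nodes the lookups touch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-thread slot list (NOT Boost.Thread's real structure).
struct ToyTssList {
    std::vector<std::size_t> keys;  // one entry per registered key, never removed
    std::size_t visited = 0;        // total nodes touched by all lookups so far
    std::size_t next_key = 0;       // next unused key id

    // Register a fresh key at the tail and return its id.
    std::size_t add_key() {
        keys.push_back(next_key);
        return next_key++;
    }

    // Linear search from the head, counting every node it touches.
    bool find(std::size_t key) {
        for (std::size_t k : keys) {
            ++visited;
            if (k == key) return true;
        }
        return false;
    }
};

// Process `files` inputs. Assumption: each input registers one key that is
// never released, then looks that key up `lookups_per_file` times.
// Returns the number of nodes visited during this batch only.
std::size_t run_batch(ToyTssList& tss, std::size_t files, std::size_t lookups_per_file) {
    const std::size_t before = tss.visited;
    for (std::size_t f = 0; f < files; ++f) {
        const std::size_t key = tss.add_key();
        for (std::size_t i = 0; i < lookups_per_file; ++i) {
            tss.find(key);
        }
    }
    return tss.visited - before;
}

// Run several identical batches against the same list and record each batch's cost.
std::vector<std::size_t> batch_costs(std::size_t batches, std::size_t files,
                                     std::size_t lookups_per_file) {
    ToyTssList tss;
    std::vector<std::size_t> costs;
    for (std::size_t b = 0; b < batches; ++b) {
        costs.push_back(run_batch(tss, files, lookups_per_file));
    }
    return costs;
}
```

Calling batch_costs(6, 30, 100) mirrors the 6 batches of 30 files from comment 6 (the 100 lookups per file is an arbitrary choice). In this model the first batch touches 46500 nodes and the second 136500, and the cost keeps climbing although every batch does identical work. Whether the real slowdown comes from entries that are never released or from something else in the closure code is exactly what the thread leaves open.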