Opened 11 years ago

Closed 11 years ago

#5735 closed Feature Requests (fixed)

proto should force functions to be inline

Reported by: Mathias Gaunard Owned by: Eric Niebler
Milestone: To Be Determined Component: proto
Version: Boost 1.49.0 Severity: Problem
Keywords: Cc:

Description

Some functions in Boost.Proto should really always be inlined; unfortunately, they don't always properly get inlined, sometimes due to compiler bugs (I can think of problems with a certain GCC version on 64-bit for example).

I suggest that Proto does what it can to force those functions to be inline (using attribute((always_inline)) on GCC and forceinline with MSVC). This will also have the effect of yielding warnings when a function can't be inlined, and at least with GCC, functions will be inlined even in debug mode.

Attachments (1)

proto_forceinline.tar.gz (17.9 KB ) - added by Mathias Gaunard 11 years ago.
Adds BOOST_FORCEINLINE in relevant parts of Boost.Proto

Download all attachments as: .zip

Change History (10)

comment:1 by Eric Niebler, 11 years ago

Resolution: invalid
Status: newclosed

This is none of Proto's business. If there is a compiler bug leading it to poorly inline, then it sould be fixed in the compiler, not worked around in Proto. Forcing inlining might actually degrade performance for compilers that have good inlining heuristics.

comment:2 by Mathias Gaunard, 11 years ago

Since Proto is used to build many performance sensitive libraries and applications, and there is no downside to implementing this feature, I don't understand why you're discarding this request so easily.

Proto is a library for others to build upon, each having their own uses, requirements, compilers and toolchains. No compiler is perfect, and inlining is certainly not a perfect feature and is very dependent of the flags used -- which themselves may have other negative effects. That means that you cannot rely on the compiler to do it reliably, especially when large codebases are involved. There are always inlining issues, different ones with each version of GCC for example, including recent and widely deployed ones.

Some Proto constructs make no sense to not be inlined, so it seems like a good idea to tell the compiler that yes, they should always be inlined, and if it cannot for some reason, it should emit an error. It is also important in order to get decent performance when running the code in debug mode or in modes that are not too aggressive with inlining, such as size-optimized builds, popular on embedded systems.

I believe doing this would only improve the quality of Proto and make people more eager to rely on it.

I think the following paper, in section 9, sums up the importance of inlining with expression templates pretty well: http://arxiv.org/pdf/1104.1729

comment:3 by Eric Niebler, 11 years ago

Wow, the paper is from this year, and they think that Blitz++ and uBLAS represent the bleeding edge of ET technology. That's laughable.

But the perf measurements are interesting. I can't see where they say what compiler they're using for their perf benchmarks, can you? If they're using that buggy GCC compler version, then yes, it will grossly screw up the results. They talk about cache issues, but only consider data size and not code size.

Joel Falcou is the one closest to this issue. He's filed specific bugs against Proto (function X should be changed in way Y for better inlining), which I have addressed promptly. He hasn't noticed any problems that would require any across-the-board change like the one you're suggesting. Or if he has, he hasn't said anything about it. If you can show me a specific case where a (non-buggy, mainstream) compiler generates slow Proto code because of failed inlining, I'd be more easily convinced.

But you should bring this up on the Proto list and make sure Joel sees it. I'd be curious to hear his thoughts.

comment:4 by Mathias Gaunard, 11 years ago

Type: BugsFeature Requests

Yes, that paper is "not terribly good", yet it has a couple of wise things in it, and the conclusion is interesting, which is why I put it there. (it's from Thomas Heller's advisor, by the way)
I unfortunately don't have more information on how they got those results.

I haven't personally run into problems with inlining and Boost.Proto, since I haven't done things that were very performance critical with it yet that required me to look at the produced asm, but I have run into issues in NT2 (in the non-ET parts) where some very small and trivial functions didn't get inlined on some GCC versions or settings, which was catastrophic to performance (hundreds of times slower).
Some other NT2 users ran into similar problems on MSVC, albeit on different functions.

I believe putting force inline attributes to functions is an even more reliable solution than making code "compiler-friendly", since this way we can have a guarantee that things will be inlined rather than analyzing asm and finding out a workaround on a case by case basis.

I'm only suggesting that a BOOST_PROTO_FORCE_INLINE be put instead of inline in the cases where it appears to be a good idea. Is that considered to be too intrusive in the source code?

In any case I'll get Joel in the loop.

in reply to:  3 comment:5 by anonymous, 11 years ago

Replying to eric_niebler:

Wow, the paper is from this year, and they think that Blitz++ and uBLAS represent the bleeding edge of ET technology. That's laughable.

Yeah quite sad ...

As for the problem at hand, I have mgaunard handy in my office, and as he said, we got some sporadic MSVC 10 with some size aggressive optimization. Now, I dont think we need an all-over-your-face approach but just have a force inline macro handy that me/you can apply where and when necessary.

comment:6 by Mathias Gaunard, 11 years ago

I suggest at least proto::value(), proto::child, make_expr and related transforms be marked force inline. Looking at assembly, it happens relatively often that boost::proto::value is not inlined, even though it definitely should always be.

boost/config.hpp now contains a BOOST_FORCEINLINE macro that can be used for this purpose.

by Mathias Gaunard, 11 years ago

Attachment: proto_forceinline.tar.gz added

Adds BOOST_FORCEINLINE in relevant parts of Boost.Proto

comment:7 by Mathias Gaunard, 11 years ago

The attached patch adds BOOST_FORCEINLINE to the following functions:

  • as_lvalue, ignore_unused, dont_care
  • as_child, as_expr
  • proto_base, address_of_hack
  • child, child_c, value
  • make_expr, unpack_expr
  • operators and operators defined by extends, extends constructor
  • built-in generators
  • the transform operator() overloads that give dummy state and data
  • call, make, when and pass_through transforms

This patch adds a significant number of BOOST_FORCEINLINE, but all of them make sense, because the functions in question are lightweight wrappers or only forward non-recursively to another function.

comment:8 by Eric Niebler, 11 years ago

Resolution: invalid
Status: closedreopened
Version: Boost 1.47.0Boost 1.49.0

This is fixed on trunk in [75578]. I'll merge this to release after the tests have cycles. Thanks, Mathias.

comment:9 by Eric Niebler, 11 years ago

Resolution: fixed
Status: reopenedclosed

(In [76026]) merge [75578] from trunk, fixes #5735

Note: See TracTickets for help on using tickets.