Opened 12 years ago

Closed 7 years ago

#4421 closed Patches (fixed)

fix for #4400

Reported by: Wolf Lammen <ookami1@…> Owned by: No-Maintainer
Milestone: Boost 1.44.0 Component: preprocessor
Version: Boost 1.44.0 Severity: Problem
Keywords: Cc:

Description

The attached patch file fixes the bug described in ticket #4400

Yet to be done: has somebody access to a compiler based on Edison Design Group C++ front end version 3.0.8 or earlier? I would like the code being compiled by such a compiler and see, whether the code fork for that compiler type runs properly.

The preprocessor code has not been touched in the past 7 years, and seemingly there is no active maintainer around, so I chose to keep the fixes impact on existing software is minimal as possible. I can tell from the sources that the author of the preprocessor code meant BOOST_PP_SEQ_REST_N to fulfil a certain undocumented condition, which it violates in a single case, so a good fix would care about said condition. But BOOST_PP_SEQ_REST_N is much more at the core of the preprocessor, and a change affects more code. So, instead, I made the only client BOOST_PP_SEQ_REPLACE relying on that condition when BOOST_PP_SEQ_REST_N fails to fulfil it, independent of BOOST_PP_SEQ_REST_N.

The code contains a compiler fork, that treats preprocessors with rescanning and/or argument substitution problems differently. I used the same technics I found elsewhere in boost code to work around that problems. I think I it is ok, and my (compliant) compiler does not complain, but, of course, there is a slight chance I got it wrong.

To verify the fix is correct I used a self written test suite, that checks the validity of a BOOST_PP_SEQ_REPLACE implementation. It uses a brute force method, i.e. sequences of all possible lengths are tested, each element is replaced, and each replacement is individually checked. The test suite allows for varying the input sequence, so sequences containing characters with special meaning to the preprocessor has been tested as well. The test suite is a UNIX bash shell script, which I attached here as well.

The test results show without exception so far,

  • that the fixed code produces exactly the same results than the old code, whereever the old code succeeds. This includes the distribution of white space characters in between sequence elements;
  • that the fixed code produces the correct result, where the old code fails;
  • that the same holds if the BOOST_PP_CONFIG_EDG flag is forced. However, the GNU preprocessor I used is standard compliant, so an old EDG based compiler might still fail.

As I changed the code in seq_replace.hpp significantly, and in a not trivial manner, I added my copyright notice.

Cheers

Wolf Lammen

Attachments (4)

seq_replace.patch (2.4 KB ) - added by Wolf Lammen <ookami1@…> 12 years ago.
testseq.sh (8.3 KB ) - added by Wolf Lammen <ookami1@…> 12 years ago.
test suite
seq_split.patch (3.5 KB ) - added by Wolf Lammen <ookami1@…> 12 years ago.
seq_first_rest_n.patch (3.7 KB ) - added by ookami1@… 12 years ago.

Download all attachments as: .zip

Change History (17)

by Wolf Lammen <ookami1@…>, 12 years ago

Attachment: seq_replace.patch added

by Wolf Lammen <ookami1@…>, 12 years ago

Attachment: testseq.sh added

test suite

comment:1 by Wolf Lammen <ookami1@…>, 12 years ago

The more I look into the preprocessor source code, the more I am convinced, Mr. Mensonides wanted his macros to be able to handle empty sequences. This is in contrast to how I read the preprocessor documentation:

"Sequences are data structures that merge the properties of both lists and tuples with the exception that a seq cannot be empty. Therefore, an 'empty' seq is considered a special case scenario that must be handled separately in C++."

If my assumption above is right, BOOST_PP_SEQ_REST_N works incorrect for 256 as its first argument.

So here is an alternative patch seq_split.patch, that fixes _both_ BOOST_PP_SEQ_REST_N and BOOST_PP_SEQ_REPLACE, and, to boot, simplifies code in seq/rest_n.cpp and seq/first_n.cpp.

IMHO, it should be preferred over my previous patch seq_replace.patch, that fixes BOOST_PP_SEQ_REPLACE only.

by Wolf Lammen <ookami1@…>, 12 years ago

Attachment: seq_split.patch added

comment:2 by Steven Watanabe, 12 years ago

Your patch prevents BOOST_PP_SEQ_REST_N from handling empty sequences. In the current C++ standard, empty sequences don't work because empty macro arguments are illegal, but IIRC, C99 and C++0x both allow empty macro arguments.

See http://chaos-pp.cvs.sourceforge.net/viewvc/chaos-pp/chaos-pp/chaos/preprocessor/seq/drop.h?hideattic=0&revision=1.3&view=markup for another implementation.

comment:3 by Steven Watanabe, 12 years ago

Never mind. I missed the changes to split.hpp. However, I'm not sure that n ## seq is legal if seq is empty.

comment:4 by Steven Watanabe, 12 years ago

Unfortunately, your patch adds an extraneous BOOST_PP_EMPTY in BOOST_PP_SEQ_SPLIT. Are you sure that this doesn't break anything else?

in reply to:  4 comment:5 by ookami1@…, 12 years ago

Replying to steven_watanabe:

Unfortunately, your patch adds an extraneous BOOST_PP_EMPTY in BOOST_PP_SEQ_SPLIT. Are you sure that this doesn't break anything else?

Hi Steven,

Grepping shows, that the only clients of BOOST_PP_SEQ_SPLIT within boost are BOOST_PP_SEQ_FIRST_N and BOOST_PP_SEQ_REST_N, and outside of boost SHOULD not exist a user, because BOOST_PP_SEQ_SPLIT, hidden in a detail directory, is not part of the preprocessor interface. And both clients, of course, have been adapted to the semantic change of BOOST_PP_SEQ_SPLIT.

The usage of BOOST_PP_EMPTY is an idiom required, because C90 does not guarantee proper handling of empty arguments to macros. So MACRO(seq BOOST_PP_EMPTY) is a conformant way to pass an empty sequence (or anything that might expand to nothing) as a non-empty argument to a macro. During expansion the padded BOOST_PP_EMPTY is easily popped off by appending () to the passed argument. This technique is found all over the preprocessor code.

BOOST_PP_SEQ_SPLIT has been tailored to follow this idiom suit: It expects a sequence padded with BOOST_PP_EMPTY, and it will return both the head and the tail of a split sequence padded again. This safe handling of empty heads or tails allowed the fixing BOOST_PP_SEQ_REPLACE (and, to boot, BOOST_PP_REST_N, BOOST_PP_FIRST_N)

cheers Wolf Lammen

in reply to:  3 comment:6 by ookami1@…, 12 years ago

Replying to steven_watanabe:

Never mind. I missed the changes to split.hpp. However, I'm not sure that n ## seq is legal if seq is empty.

You are right here, this will not work, when an empty sequence (actually it will not be exactly empty, but BOOST_PP_EMPTY instead) is submitted. Fortunately, you will never call BOOST_PP_SEQ_REPLACE with an empty sequence. And calls to BOOST_PP_REST_N or BOOST_PP_FIRST_N with empty sequences are considered illegal as well, according to the documentation. So the problem reduces to a possibly internal usage of the latter macros, exploiting the fact, that they internally appended a (nil) to sequences. If I remember right, I already checked on that condition, but I am a bit unsure right now, so I will do it again.

If it turns out to be a problem, the solution seems straightforward: For Microwerks (only):

#define BOOST_PP_SEQ_SPLIT_0_BOOST_PP_EMPTY BOOST_PP_EMPTY, BOOST_PP_EMPTY

cheers Wolf Lammen

comment:7 by ookami1@…, 12 years ago

The only indirect callers of BOOST_PP_SEQ_SPLIT within boost, but outside of preprocessor/seq, I could find, are macros in:

mpl/aux_/preprocessor/range.hpp (and their users)

parameter/preprocessor.hpp (and their users)

Only for some calls to BOOST_PP_SEQ_FIND_N in parameter/preprocessor.hpp I was unable to rule out they pass empty sequences, mostly, because I got lost in the hierarchies of layers found there. If (that is not for sure!) they really pass empty sequences to the preprocessor, then

  1. this violates the documented limitations of the preprocessor code;
  2. this is likely to fail on exotic compilers, because some do not support empty macro parameters at all.

So, what line is to follow here? Fortify BOOST_PP_SEQ_FIRST_N such that it tolerates abuses to the extent possible? Stay within the asserted limitations and let bugs surface in dependent code? Notify the maintainers of boost/parameter of possible conflicts? Start a discussion on the mailing list? Give up on supporting sequences of size 256 altogether and decrement BOOST_PP_LIMIT_SEQ (currently set to 256, but I know many macros (if not the majority) in preprocessor/seq do not support this limit)?

by ookami1@…, 12 years ago

Attachment: seq_first_rest_n.patch added

comment:8 by anonymous, 12 years ago

seq_first_rest_n.patch fixes BOOST_PP_SEQ_FIRST_N and BOOST_PP_SEQ_REST_N, so they both accept sequences of length 256 now. As a bonus feature, for preprocessors sufficiantly handling empty arguments (or those expanding to nothing), both macros process empty sequences as well. Usage of empty sequences is outside of the specs of the preprocessor code, and is not generally supported by the preprocessor library.

The previous patches did not handle empty sequences properly for DMC based preprocessors. This is corrected here, no other difference is introduced.

This patch supersedes all previous ones.

comment:9 by Edward Diener, 7 years ago

Fixed in the latest preprocessor code in the 'develop' branch.

While I appreciate the patches I did not use them. It is an undefined result to call BOOST_PP_SEQ_REST_N with an 'n' which is equal to the size of the 'seq'. The fact that this returns nothing is an implementation detail, since a seq cannot be empty and BOOST_PP_SEQ_REST_N clearly states that it expands to a seq. I will update the documentation accordingly. I am not willing to change BOOST_PP_SEQ_REST_N to accomodate a situation which should be undefined.

in reply to:  9 comment:10 by ookami1@…, 7 years ago

I do see your point, but let me comment on it nevertheless. This is not meant to talk you into applying this patches. It is fine with me, and actually I was quite surprised and pleased to see somebody care about this report at all after such a long time.

Usually, you would do parameter checking in order to enforce an API. This cannot really be done in preprocessor code, as this is very expensive and requires lots of macro replacements, unduely slowing down compiling.

So it is up to users to carefully use the preprocessor code. Apart from sloppy mindsets found all over the place, correct handling is not made easy. Who really sees at a glance that BOOST_PP_SEQ_REST_N must not be called with an index pointing to the last element. And are there other places with obscure restrictions? Then we have the exceptions (see the documentation for BOOST_PP_SEQ_NIL) which may fool people into thinking, they can do likewise somewhere else.

On top, it does not help that BOOST_PP_SEQ_REST_N in reality behaves as expected in the vast majority of (and practically all relevant) cases. So code appears to be correct, but it is not.

I proposed my patches because I felt it is more appropriate to be fault tolerant in such a situation.

cheers Wolf

Replying to eldiener:

Fixed in the latest preprocessor code in the 'develop' branch.

While I appreciate the patches I did not use them. It is an undefined result to call BOOST_PP_SEQ_REST_N with an 'n' which is equal to the size of the 'seq'. The fact that this returns nothing is an implementation detail, since a seq cannot be empty and BOOST_PP_SEQ_REST_N clearly states that it expands to a seq. I will update the documentation accordingly. I am not willing to change BOOST_PP_SEQ_REST_N to accomodate a situation which should be undefined.

comment:11 by Edward Diener, 7 years ago

It is usually possible to do a certain amount of parameter checking in Boost PP but there is a compile time expense to doing it. Adding more checking at a lower level slows down all macro code that uses that lower level macro, while doing more checking at a higher level only slows down that macro code calling the higher level macro. That was one decision on why I chose to only change BOOST_PP_SEQ_REPLACE.

Another decision is to minimize possible changes to already existing usage while correcting functionality. The less change to have something work according to documentation is often more preferred than having more changes.

I did update the documentation to BOOST_PP_SEQ_REST_N regarding the issue of using passing it an amount equal to or greater than the seq size.

My Boost VMD library deals with emptiness as a viable concept in the preprocessor, because the test for emptiness, which can never work perfectly, is much better using variadic macros than when not using variadic macros. But in Boost PP, whose supports both non-variadic and some variadic macro use, emptiness should never be depended on.

comment:12 by Edward Diener, 7 years ago

This general problem has been fixed in the Boost 1.59 release.

comment:13 by Edward Diener, 7 years ago

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.