Opened 12 years ago

Closed 7 years ago

#4400 closed Bugs (fixed)

BOOST_PP_SEQ_REPLACE fails in corner cases

Reported by: Wolf Lammen <ookami1@…> Owned by: No-Maintainer
Milestone: Boost 1.44.0 Component: preprocessor
Version: Boost 1.44.0 Severity: Problem
Keywords: Cc:

Description

The following code replaces individual elements (first the 255th, then the 256th) in a given preprocessor sequence of length 256. I noticed that sometimes the element is not replaced, but inserted instead, yielding a longer sequence.

Using Boost version 1.4.3, the following code will not compile


/*
 * booster.cpp
 *
 *  Created on: 25.06.2010
 *      Author: Wolf Lammen
 */

#include "boost/preprocessor.hpp"

#define SEQ_256 \
	(1)(2)(3)(4)(5)(6)(7)(8)(9) \
	(10)(11)(12)(13)(14)(15)(16)(17)(18)(19) \
	(20)(21)(22)(23)(24)(25)(26)(27)(28)(29) \
	(30)(31)(32)(33)(34)(35)(36)(37)(38)(39) \
	(40)(41)(42)(43)(44)(45)(46)(47)(48)(49) \
	(50)(51)(52)(53)(54)(55)(56)(57)(58)(59) \
	(60)(61)(62)(63)(64)(65)(66)(67)(68)(69) \
	(70)(71)(72)(73)(74)(75)(76)(77)(78)(79) \
	(80)(81)(82)(83)(84)(85)(86)(87)(88)(89) \
	(90)(91)(92)(93)(94)(95)(96)(97)(98)(99) \
	(100)(101)(102)(103)(104)(105)(106)(107)(108)(109) \
	(110)(111)(112)(113)(114)(115)(116)(117)(118)(119) \
	(120)(121)(122)(123)(124)(125)(126)(127)(128)(129) \
	(130)(131)(132)(133)(134)(135)(136)(137)(138)(139) \
	(140)(141)(142)(143)(144)(145)(146)(147)(148)(149) \
	(150)(151)(152)(153)(154)(155)(156)(157)(158)(159) \
	(160)(161)(162)(163)(164)(165)(166)(167)(168)(169) \
	(170)(171)(172)(173)(174)(175)(176)(177)(178)(179) \
	(180)(181)(182)(183)(184)(185)(186)(187)(188)(189) \
	(190)(191)(192)(193)(194)(195)(196)(197)(198)(199) \
	(200)(201)(202)(203)(204)(205)(206)(207)(208)(209) \
	(210)(211)(212)(213)(214)(215)(216)(217)(218)(219) \
	(220)(221)(222)(223)(224)(225)(226)(227)(228)(229) \
	(230)(231)(232)(233)(234)(235)(236)(237)(238)(239) \
	(240)(241)(242)(243)(244)(245)(246)(247)(248)(249) \
	(250)(251)(252)(253)(254)(255)(256)

#if BOOST_PP_SEQ_SIZE(BOOST_PP_SEQ_REPLACE(SEQ_256, 254, a)) != 256
# error "replacement at 254: size of sequence has changed"
#endif
#if BOOST_PP_SEQ_SIZE(BOOST_PP_SEQ_REPLACE(SEQ_256, 255, a)) != 256
# error "replacement at 255: size of sequence has changed"
#endif

int main()
{
	return 0;
}

Here is the error message:

Invoking: GCC C++ Compiler g++ -I"/home/wolf/workspace/booster" -O0 -g3 -Wall -c -fmessage-length=0 -MMD -MP -MF"booster.d" -MT"booster.d" -o"booster.o" "../booster.cpp" ../booster.cpp:42:3: error: #error "replacement at 255: size of sequence has changed"

Note that the first call to BOOST_PP_SEQ_REPLACE succeeded.

I use the GNU Compiler collection version (Ubuntu 4.4.1-4ubuntu9) 4.4.1 on x86_64 GNU/Linux

=================== ANALYSIS OF THE BUG ======================

According to the documentation, valid ranges of lengths of boost preprocessor sequences are from 1 to BOOST_PP_LIMIT_SEQ (=256). Especially empty sequences are not allowed. This overall restriction imposes limits to parameters of macros related to sequences. In particular, you must not call

BOOST_PP_SEQ_REST_N(i, seq)

with i == BOOST_PP_SEQ_SIZE(seq), as this would force BOOST_PP_REST_N to return an empty sequence. BOOST_PP_SEQ_REPLACE does use BOOST_PP_REST_N in this way, thus relying on undefined behavior. Here is how BOOST_PP_SEQ_REPLACE is defined:

#    define BOOST_PP_SEQ_REPLACE(seq, i, elem) BOOST_PP_SEQ_FIRST_N(i, seq) (elem) BOOST_PP_SEQ_REST_N(BOOST_PP_INC(i), seq)

If i points to the last element of the sequence, it equals (size(seq) - 1) and BOOST_PP_INC(i) yields size(seq), an invalid argument to BOOST_PP_REST_N. I called BOOST_PP_REST_N with an offset equal to size(seq) for various sequences, and even though this was outside of the specifications, it expanded to an empty string - most of the time. An empty string would have been fine for BOOST_PP_SEQ_REPLACE, but unfortunately, BOOST_PP_REST_N failed to do so for sequences of maximum size. It internally increments its first parameter, and since incrementation saturates at 256, the algorithm delivers a different result then.

========================= FIX ======================

Here is a suggestion on how to fix this:

Extend the domain of BOOST_PP_REST_N and allow the first parameter to be equal to the size of the sequence. Make the macro reliably expand to an empty string then:

#define BOOST_PP_SEQ_REST_N(n, seq) BOOST_PP_IF(n, BOOST_PP_TUPLE_ELEM(2, 1, BOOST_PP_SEQ_SPLIT(BOOST_PP_INC(BOOST_PP_DEC(n)), seq BOOST_PP_EMPTY))(), seq)

This is quick and dirty, and not optimal, because BOOST_PP_TUPLE_ELEM is always evaluated, but according to my tests it works. As I am not an experienced macro programmer, I would like to leave the details to your experts.

With kind regards Wolf Lammen

Change History (16)

comment:1 by Wolf Lammen <ookami1@…>, 12 years ago

After fiddling with the code a bit, I suggest, as a safe and quick fix, a replacement of

boost/preprocessor/seq/replace.hpp

by

******
#  *                                                                          *
#  *     (C) Copyright Paul Mensonides 2002.
#  *     Distributed under the Boost Software License, Version 1.0. (See
#  *     accompanying file LICENSE_1_0.txt or copy at
#  *     http://www.boost.org/LICENSE_1_0.txt)
#  *                                                                          *
#  ************************************************************************** */
#
# /* See http://www.boost.org for most recent version. */
#
# ifndef BOOST_PREPROCESSOR_SEQ_REPLACE_HPP
# define BOOST_PREPROCESSOR_SEQ_REPLACE_HPP
#
# include <boost/preprocessor/arithmetic/inc.hpp>
# include <boost/preprocessor/config/config.hpp>
# include <boost/preprocessor/seq/first_n.hpp>
#
# /* BOOST_PP_SEQ_REPLACE */
#
# if ~BOOST_PP_CONFIG_FLAGS() & BOOST_PP_CONFIG_EDG()
#    define BOOST_PP_SEQ_REPLACE(seq, i, elem) BOOST_PP_SEQ_FIRST_N(i, seq) (elem) BOOST_PP_SEQ_REST_NP1(i, seq)
#    define BOOST_PP_SEQ_REST_NP1(n, seq) BOOST_PP_TUPLE_ELEM(2, 1, BOOST_PP_SEQ_SPLIT(BOOST_PP_INC(n), seq BOOST_PP_EMPTY))()
# else
#    define BOOST_PP_SEQ_REPLACE(seq, i, elem) BOOST_PP_SEQ_REPLACE_I(seq, i, elem)
#    define BOOST_PP_SEQ_REPLACE_I(seq, i, elem) BOOST_PP_SEQ_FIRST_N(i, seq) (elem) BOOST_PP_SEQ_REST_NP1(i, seq)
#    define BOOST_PP_SEQ_REST_NP1(i, seq) BOOST_PP_SEQ_REST_NP1_I(n, seq)
#    define BOOST_PP_SEQ_REST_NP1_I(n, seq) BOOST_PP_TUPLE_ELEM(2, 1, BOOST_PP_SEQ_SPLIT(BOOST_PP_INC(n), seq BOOST_PP_EMPTY))()
# endif
#
# endif

The idea behind this rework is to use a dedicated version of BOOST_PP_REST_N, called BOOST_PP_REST_NP1 here (NP1 stands for N + 1), and not rely on undocumented behavor of BOOST_PP_REST_N any more.

Since the former code called BOOST_PP_REST_N with its first argument at least 1, the replacement code can be simplified a little. The sentinel element (nil) in the replacement code of BOOST_PP_REST_N is needed to allows for 0 as n, which is not used here, so (nil) can be eliminated. As a consequence, arithmetic ssaturation can be avoided as well.

Cheers

Wolf Lammen

in reply to:  1 comment:2 by Wolf Lammen <ookami1@…>, 12 years ago

a) some include's should be present to keep the code self contained

b) use explicit macro name BOOST_PP_SEQ_REST_N_PLUS_1

******
#  *                                                                          *
#  *     (C) Copyright Paul Mensonides 2002.
#  *     Distributed under the Boost Software License, Version 1.0. (See
#  *     accompanying file LICENSE_1_0.txt or copy at
#  *     http://www.boost.org/LICENSE_1_0.txt)
#  *                                                                          *
#  ************************************************************************** */
#
# /* See http://www.boost.org for most recent version. */
#
# ifndef BOOST_PREPROCESSOR_SEQ_REPLACE_HPP
# define BOOST_PREPROCESSOR_SEQ_REPLACE_HPP
#
# include <boost/preprocessor/arithmetic/inc.hpp>
# include <boost/preprocessor/config/config.hpp>
# include <boost/preprocessor/seq/detail/split.hpp>
# include <boost/preprocessor/seq/first_n.hpp>
# include <boost/preprocessor/facilities/empty.hpp>
# include <boost/preprocessor/tuple/elem.hpp>
#
# /* BOOST_PP_SEQ_REPLACE */
#
# if ~BOOST_PP_CONFIG_FLAGS() & BOOST_PP_CONFIG_EDG()
#    define BOOST_PP_SEQ_REPLACE(seq, i, elem) BOOST_PP_SEQ_FIRST_N(i, seq) (elem) BOOST_PP_SEQ_REST_N_PLUS_1(i, seq)
#    define BOOST_PP_SEQ_REST_N_PLUS_1(n, seq) BOOST_PP_TUPLE_ELEM(2, 1, BOOST_PP_SEQ_SPLIT(BOOST_PP_INC(n), seq BOOST_PP_EMPTY))()
# else
#    define BOOST_PP_SEQ_REPLACE(seq, i, elem) BOOST_PP_SEQ_REPLACE_I(seq, i, elem)
#    define BOOST_PP_SEQ_REPLACE_I(seq, i, elem) BOOST_PP_SEQ_FIRST_N(i, seq) (elem) BOOST_PP_SEQ_REST_NP1(i, seq)
#    define BOOST_PP_SEQ_REST_N_PLUS_1(i, seq) BOOST_PP_SEQ_REST_N_PLUS_1_I(n, seq)
#    define BOOST_PP_SEQ_REST_N_PLUS_1_I(n, seq) BOOST_PP_TUPLE_ELEM(2, 1, BOOST_PP_SEQ_SPLIT(BOOST_PP_INC(n), seq BOOST_PP_EMPTY))()
# endif
#
# endif


comment:3 by ookami1@…, 12 years ago

somehow I missed a couple of characters when copying & pasting the code. Make sure the first (commentary) line looks more like

#  /***************************************************************************

comment:4 by Wolf Lammen <ookami1@…>, 12 years ago

wrote a test suite checking my fix. It found a bug in the EDG branch. Here's the updated file:

#  ****************************************************************************
#  *                                                                          *
#  *     (C) Copyright Paul Mensonides 2002.
#  *     Distributed under the Boost Software License, Version 1.0. (See
#  *     accompanying file LICENSE_1_0.txt or copy at
#  *     http://www.boost.org/LICENSE_1_0.txt)
#  *                                                                          *
#  ************************************************************************** */
#
# /* See http://www.boost.org for most recent version. */
#
# ifndef BOOST_PREPROCESSOR_SEQ_REPLACE_HPP
# define BOOST_PREPROCESSOR_SEQ_REPLACE_HPP
#
# include <boost/preprocessor/arithmetic/inc.hpp>
# include <boost/preprocessor/config/config.hpp>
# include <boost/preprocessor/seq/detail/split.hpp>
# include <boost/preprocessor/seq/first_n.hpp>
# include <boost/preprocessor/facilities/empty.hpp>
# include <boost/preprocessor/tuple/elem.hpp>
#
# /* BOOST_PP_SEQ_REPLACE */
#
# if ~BOOST_PP_CONFIG_FLAGS() & BOOST_PP_CONFIG_EDG()
#    define BOOST_PP_SEQ_REPLACE(seq, i, elem) BOOST_PP_SEQ_FIRST_N(i, seq) (elem) BOOST_PP_SEQ_REST_N_PLUS_1(i, seq)
#    define BOOST_PP_SEQ_REST_N_PLUS_1(n, seq) BOOST_PP_TUPLE_ELEM(2, 1, BOOST_PP_SEQ_SPLIT(BOOST_PP_INC(n), seq BOOST_PP_EMPTY))()
# else
#    define BOOST_PP_SEQ_REPLACE(seq, i, elem) BOOST_PP_SEQ_REPLACE_I(seq, i, elem)
#    define BOOST_PP_SEQ_REPLACE_I(seq, i, elem) BOOST_PP_SEQ_FIRST_N(i, seq) (elem) BOOST_PP_SEQ_REST_NP1(i, seq)
#    define BOOST_PP_SEQ_REST_N_PLUS_1(n, seq) BOOST_PP_SEQ_REST_N_PLUS_1_I(n, seq)
#    define BOOST_PP_SEQ_REST_N_PLUS_1_I(n, seq) BOOST_PP_TUPLE_ELEM(2, 1, BOOST_PP_SEQ_SPLIT(BOOST_PP_INC(n), seq BOOST_PP_EMPTY))()
# endif
#
# endif


comment:5 by Edward Diener, 7 years ago

Fixed in the latest preprocessor code on the 'develop' branch.

in reply to:  5 comment:6 by anonymous, 7 years ago

Replying to eldiener:

Fixed in the latest preprocessor code on the 'develop' branch.

I cannot check the develop branch as I use only git, so I don't know whether you spotted the following yourself.

5 years ago I wrote "wrote a test suite checking my fix. It found a bug in the EDG branch", because the renaming BOOST_PP_SEQ_REST_NP1 -> BOOST_PP_SEQ_REST_N_PLUS_1 wasn't carried out completely.

Unfortunately, the code in comment5 isn't the fixed one. It still refers to BOOST_PP_SEQ_REST_NP1 in one place, like the one in comment3.

I am awfully sorry for that.

Wolf Lammen

comment:7 by Edward Diener, 7 years ago

| "I cannot check the develop branch as I use only git"

I do not know what this means.

The 'develop' branch uses Boost on Github, which means Boost now uses git. My solution does not involve your patches.

For getting the latest Boost source on git, see https://svn.boost.org/trac/boost/wiki/ModularBoost and https://svn.boost.org/trac/boost/wiki/TryModBoost.

in reply to:  7 comment:8 by ookami1@…, 7 years ago

Replying to eldiener:

| "I cannot check the develop branch as I use only git"

I do not know what this means.

I checked out 'develop' on the top level directory, which did not affect the subdirectory 'libs/preprocessor'.

After checking out 'develop' in 'libs/preprocessor' I can see your changes now.

Seems I did not understand the organization of the boost source tree at once. So, please, forgive me. I just posted the comment in order to prevent a glitch making its way into boost sources.

Wolf Lammen

comment:9 by ookami1@…, 7 years ago

After having looked at your solution, here are my comments:

Lets look at BOOST_PP_SEQ_REPLACE((a),0,b)

I can tell from the list of replacements that you somewhere in the middle call

BOOST_PP_SEQ_FIRST_N(0,(a))

expecting it to return an empty replacement.

Ok, we know both, BOOST_PP_SEQ_FIRST_N does the right thing by pure chance and returns an empty sequence. But given your formalistic stance, shouldn't usage of index 0 with BOOST_PP_SEQ_FIRST_N be banned as much as BOOST_PP_SEQ_REST_N is with an index == size of sequence? Don't you rely on undefined behavior as well?

And your solution comes with a price: The number of replacements increases from 28 to 38 in the above simple case, goes up from 57 to 117 in a 24-element sequence, such getting worse by the size. Another minor fault is the different spacing in the part before a replacement and after it: "(a) (a) (a) (b) (a)(a)(a)". This is less of a problem as long as you do just macro magic, but becomes essentiell once you stringize a sequence as I did.

See, you and Mr Mensonides had difficulties in getting these things right, let alone an averaged programmer like me. If you ask me, I still prefer a change in interface and let BOOST_PP_SEQ_FIRST_N and BOOST_PP_SEQ_REST_N either return a sequence, if exists, or an empty string, in the range 0..size of seq. This is the right result in all instances where you concatenate sequences, and it does not hurt elsewhere.

cheers Wolf

comment:10 by Edward Diener, 7 years ago

I agree with you that I am relying on what I tell the user is undefined behavior. But that does not mean that the user should rely on that behavior. In particular I do not want the user to rely on such behavior.

You talk about a change in interface as if it exists in an ideal vacuum. What happens when that change of interface starts breaking current code, such as thousands of lines that used to work fine ? No, sir, while I might be willing to add some new macro I am not going to break anything which people using Boost PP rely on.

In your approach an empty seq is perfectly fine, despite the fact that testing for emptiness is highly flawed when variadic macros are not supported by the compiler. I know about this issue as my entire Boost.VMD library is based around the fact that for a compiler which does support variadic macros ( which VMD must have ) testing for emptiness is still very slightly flawed, but the VMD library can live with that flaw. In Boost PP allowing a seq to be empty, ( or allowing a tuple to be empty ), is going to cause all sorts of problems in Boost PP simply because it cannot be reliably tested in a library which does support non-variadic macros.

I could add to VMD a tuple or a seq which could be empty, because then those who use VMD functionality to work with such a seq or tuple will not be involved with code breakage, but I am not going to do it with Boost PP.

If you want to propose a separate macro to be added to Boost PP please feel free to do it. I admit I did not look deeply into your solution because I was looking for the easiest fix. If your fix is much more efficient, using either a separate public macro, or a separate internal macro, I will look at this issue again. But please realize that having empty seqs ( or tuples ) in current Boost PP, or changing how the public macro BOOST_PP_SEQ_REST_N works or should be used, is not an option.

comment:11 by ookami1@…, 7 years ago

In your approach an empty seq is perfectly fine, despite the fact that testing for emptiness is highly flawed..

I'm well aware of the limitations of empty sequences, and therefore do not propose their support. As I wrote in #4421

"seq_first_rest_n.patch fixes BOOST_PP_SEQ_FIRST_N and BOOST_PP_SEQ_REST_N, so they both accept sequences of length 256 now. As a bonus feature, for preprocessors sufficiantly handling empty arguments (or those expanding to nothing), both macros process empty sequences as well. Usage of empty sequences is outside of the specs of the preprocessor code, and is not generally supported by the preprocessor library."

This is not about general support of empty sequences, let alone about breaking interfaces (When only did I propose that??). This is about more robustness. And that is seemingly needed, as you can see from the following bug, again because of bad usage of BOOST_PP_SEQ_REST_N (SEQ_256 is defined as above in the bug report)

// does not remove the last element
char x[] = BOOST_PP_STRINGIZE(BOOST_PP_SEQ_REMOVE(SEQ_256,255));

People, Boost programmers not excluded, seem to misunderstand/disregard the interface of these macros on a regular base, fortunately without consequences most of the time.

Ok, that is why I had preferred seeing after both macros. I guess they are mostly used when sequences are rearranged. In that context, at the head or tail of a sequence, an empty expansion (!= empty sequence) is a sensible result, at least better than the same result, occasionally replaced with junk.

And yes, I know that variadic macros provide better support. And that template programming is often a viable replacement for macros. This here is all about technology from the past decade. The bug number stands for that.

comment:12 by Edward Diener, 7 years ago

What do you mean by:

"BOOST_PP_SEQ_FIRST_N and BOOST_PP_SEQ_REST_N, so they both accept sequences of length 256 now" ?

If there is a bug involving them and a seq of 256 elements would you please file separate bug reports or make a pull request on the preprocessor 'develop' branch and I will look at it.

Similarly if BOOST_PP_SEQ_REMOVE fails in the situation above would you please file a separate bug report or make a pull request on the preprocessor 'develop' branch and I will look at it.

in reply to:  12 comment:13 by anonymous, 7 years ago

Replying to eldiener:

What do you mean by:

"BOOST_PP_SEQ_FIRST_N and BOOST_PP_SEQ_REST_N, so they both accept sequences of length 256 now" ?

If there is a bug involving them and a seq of 256 elements would you please file separate bug reports or make a pull request on the preprocessor 'develop' branch and I will look at it.

Meaning, I am repeating myself over and over again. I think, by now I made clear enough, what this is about.

Similarly if BOOST_PP_SEQ_REMOVE fails in the situation above would you please file a separate bug report or make a pull request on the preprocessor 'develop' branch and I will look at it.

You can safely drop the 'if'.

I did not present the bug in order to set off another round in symptom hunting and curing. I wanted to prove by evidence that not adhering to http://en.wikipedia.org/wiki/Design_by_contract has its decent downsides. IMHO either fixing 1 serving macro or hundreds of invocations is in order.

I am certain you have good reason to address this issue in a different manner, but as far as I am concerned, I am out of business.

Wolf Lammen

comment:14 by Edward Diener, 7 years ago

I have fixed the problem with BOOST_PP_SEQ_REMOVE on the 'develop' branch. I also went through all the basic seq operations dealing with changes to a seq of the maximum of 256 elements to make sure they now work properly.

comment:15 by Edward Diener, 7 years ago

The problem is now fixed in the latest Boost 1.59 release.

comment:16 by Edward Diener, 7 years ago

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.