Opened 10 years ago

Closed 10 years ago

#6886 closed Bugs (worksforme)

Bugs in boost/regex/icu.hpp (patch included)

Reported by: Martin Baute <solar@…>
Owned by: John Maddock
Milestone: To Be Determined
Component: regex
Version: Boost 1.49.0
Severity: Problem
Keywords:
Cc:

Description

I am 100% sure I already reported these a couple of years ago (when 1.45 was current), but I just realized they haven't been fixed yet.

In boost/regex/icu.hpp:

Line 238 has a preprocessor #if that applies special handling if BOOST_NO_MEMBER_TEMPLATES or __IBMCPP__ is defined. That's why the buggy code doesn't show itself on "average" machines. However, when trying to compile Boost.Regex with ICU support on an AIX machine, compilation fails due to three bugs in the #else part of the code.

The first bug is in line 314:

typedef std::vector<UCHAR32> vector_type;

The type is UChar32, not UCHAR32. The two templates before this one got it right; this one got it wrong.

The second bug is in line 318:

v.push_back((UCHAR32)(*i));

Same as above - should be UChar32.

The third bug is a classic copy&paste error. In line 319:

++a;

If you look at the surrounding code, there is no "a" defined here. The two previous templates define a 32-bit integer "a" by type-converting the "i" parameter (which is 8-bit in the first template and 16-bit in the second one), but in this third template "i" is 32-bit itself already, so there is no need to convert it. You see the surrounding "while" loop and the push_back() call using "i" directly, and the increment should do the same.
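To illustrate the three fixes together, here is a minimal, self-contained sketch of what the corrected loop looks like. It is not the actual Boost.Regex source: the function name copy_code_points is hypothetical, and UChar32 is declared locally as a stand-in for the ICU typedef so that the sketch compiles without the ICU headers.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for ICU's UChar32 (a signed 32-bit code point type),
// declared here only so the sketch builds without ICU installed.
typedef std::int32_t UChar32;

// Hypothetical helper mirroring the shape of the third template:
// the input range already holds 32-bit values, so no converting
// iterator "a" is needed and "i" is used directly.
template <class InputIterator>
std::vector<UChar32> copy_code_points(InputIterator i, InputIterator j)
{
   typedef std::vector<UChar32> vector_type;   // UChar32, not UCHAR32
   vector_type v;
   while (i != j)
   {
      v.push_back((UChar32)(*i));              // UChar32, not UCHAR32
      ++i;                                     // advance "i", not an undeclared "a"
   }
   return v;
}
```

With the original `++a;` the loop would not compile at all, because no `a` is declared in that scope; incrementing `i` is what lets the `while (i != j)` condition terminate.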

The patch:

314c314
<    typedef std::vector<UCHAR32> vector_type;
---
>    typedef std::vector<UChar32> vector_type;
318,319c318,319
<       v.push_back((UCHAR32)(*i));
<       ++a;
---
>       v.push_back((UChar32)(*i));
>       ++i;

CVS Web link for ease of reference:

http://boost.cvs.sourceforge.net/viewvc/boost/boost/boost/regex/icu.hpp?view=markup

It would be nice if someone could commit this patch, because this bug has been around since v1.33...

Change History (5)

comment:1 by Martin Baute <solar@…>, 10 years ago

Whoops, formatting... the patch:

314c314
<    typedef std::vector<UCHAR32> vector_type;
---
>    typedef std::vector<UChar32> vector_type;
318,319c318,319
<       v.push_back((UCHAR32)(*i));
<       ++a;
---
>       v.push_back((UChar32)(*i));
>       ++i;

We (my employer) have been using this patch for the last three or four years, so you could say it's tested. ;-)

comment:2 by Martin Baute <solar@…>, 10 years ago

Summary: Bugs in boost/regex/icu.hpp → Bugs in boost/regex/icu.hpp (patch included)

comment:3 by John Maddock, 10 years ago

I believe all these issues have been fixed. Note that the source code you referenced was in the old CVS repo, which hasn't been used in many years. Current source is here: http://svn.boost.org/svn/boost/trunk/boost/regex/icu.hpp and doesn't contain UCHAR32 anywhere.

Please reopen if I've missed anything.

BTW, can the IBM-specific workarounds be removed for recent compiler releases?

comment:4 by anonymous, 10 years ago

Ah, I see... perhaps it would be a good idea to link the source somewhere (more) prominent from boost.org, and to shut down the SF.net repo in the admin interface. (It was possible to do so when I last used SourceForge, which admittedly was a couple of years ago.) I clicked around the site for some time looking for "my" patch, and eventually ended up in the SF.net repository thinking that the issue hadn't been fixed.

I'll give the new IBM compiler a try sometime next week.

comment:5 by John Maddock, 10 years ago

Resolution: worksforme
Status: new → closed