Boost C++ Libraries: Ticket #521: [Regex] Splitting string: last empty token is "compressed"

When regex_token_iterator is used to split a string, the last empty token is "compressed" (dropped from the output). Seen with BCB 5.64 and Boost 1.33.0.

See the example below and compare its output with string_algo splitting:

#include <boost/regex.hpp>
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>
#include <iostream>
#include <boost/algorithm/string/finder.hpp>
#include <boost/algorithm/string/find_iterator.hpp>

void with_stringAlgo()
{
  using namespace std;
  vector<string> result;
  string to_split("&|&_field2_&|&_field3_&|&");
  // split on the literal delimiter &|&
  boost::split_iterator<string::iterator> i(to_split.begin(),to_split.end(),boost::first_finder("&|&")),j;
  for(;i!=j;++i)
  {
    result.push_back(boost::copy_range<std::string>(*i));
  }
  cout<<"size is "<<result.size()<<endl;
  for(vector<string>::size_type k=0;k<result.size();++k)
    cout<<result[k]<<endl;
}

void with_regex()
{
  using namespace std;
  using namespace boost;
  using namespace boost::regex_constants;
  string s("&|&_field2_&|&_field3_&|&");
  boost::regex r("&\\|&");//use &|& as delimiter
  boost::sregex_token_iterator i(s.begin(),s.end(),r,-1, match_default),j;
  //
  vector<string> v;
  copy(i,j,back_inserter(v));
  //
  cout<<"size is "<<v.end()-v.begin()<<endl;
  copy(v.begin(),v.end(),ostream_iterator<string>(cout,"\n"));
}

int main()
{
  with_stringAlgo();
  with_regex();
}

John Maddock Wed, 23 Nov 2005 10:47:22 GMT

It's by design, and that's the way we codified things in the C++ Standard Technical Report 1 (TR1), so it's not going to change now unless TR1 does, of course. The rationale is that you often have a series of fields, each of which is terminated by a specific string. In that case you want an empty last field to be suppressed.

John.
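
For anyone who does need the empty last field, the following is a minimal sketch of one possible workaround, not something proposed in this ticket. It assumes a C++11 compiler and uses std::regex, which adopted the TR1 wording, rather than Boost.Regex; the Boost iterators of the same names are expected to behave the same way, per the reply above. The helper names split_tr1 and split_keep_trailing are hypothetical. The first function shows the by-design behaviour; the second walks the delimiter matches by hand and always emits whatever follows the last match, even when it is empty.

```cpp
#include <regex>
#include <string>
#include <vector>

// Split `text` on `delim` with the token iterator's -1 ("between matches") mode.
// By the TR1 rule, an empty field after the final delimiter is not emitted.
std::vector<std::string> split_tr1(const std::string& text, const std::regex& delim)
{
    std::sregex_token_iterator first(text.begin(), text.end(), delim, -1), last;
    return std::vector<std::string>(first, last);
}

// Hypothetical helper: iterate over the delimiter matches directly and keep
// every field, including an empty one after the final delimiter.
std::vector<std::string> split_keep_trailing(const std::string& text, const std::regex& delim)
{
    std::vector<std::string> fields;
    std::string::const_iterator field_start = text.begin();
    std::sregex_iterator it(text.begin(), text.end(), delim), end;
    for (; it != end; ++it) {
        // Field runs from the end of the previous match to the start of this one.
        fields.push_back(std::string(field_start, (*it)[0].first));
        field_start = (*it)[0].second;
    }
    // Text after the last delimiter; pushed even when it is empty.
    fields.push_back(std::string(field_start, text.end()));
    return fields;
}
```

Called on the string from the report with the delimiter regex "&\\|&", split_tr1 should give three fields and split_keep_trailing four, which matches the string_algo result. Note that the hand-written loop treats an input with no delimiter at all as a single field, and an empty input as a single empty field.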