Opened 10 years ago

Closed 5 years ago

#6868 closed Bugs (fixed)

lex tokenize result error

Reported by: icosagon Owned by: Joel de Guzman
Milestone: To Be Determined Component: spirit
Version: Boost 1.49.0 Severity: Problem
Keywords: spirit lex Cc:

Description

If I uncomment the line #include <boost/spirit/include/phoenix_statement.hpp>, the tokenize result is correct. Why?

The result is: lines: 7, words: 0, characters: 155

The correct result is: lines: 7, words: 33, characters: 162

Compiler: Visual Studio 2010

// #define BOOST_SPIRIT_LEXERTL_DEBUG
 
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>

// If the next line is uncommented, the tokenize result is correct. Why?
//#include <boost/spirit/include/phoenix_statement.hpp>

#include <boost/spirit/include/phoenix_algorithm.hpp>
#include <boost/spirit/include/phoenix_core.hpp>

#include <iostream>
#include <fstream>
#include <string>

namespace lex = boost::spirit::lex;

struct distance_func
{
	template <typename Iterator1, typename Iterator2>
	struct result : boost::iterator_difference<Iterator1> {};

	template <typename Iterator1, typename Iterator2>
	typename result<Iterator1, Iterator2>::type 
		operator()(Iterator1& begin, Iterator2& end) const
	{
		return std::distance(begin, end);
	}
};
boost::phoenix::function<distance_func> const distance1 = distance_func();

template <typename Lexer>
struct word_count_tokens : lex::lexer<Lexer>
{
	word_count_tokens()
		: c(0), w(0), l(0)
		, word("[^ \t\n]+")     
		, eol("\n")
		, any(".")
	{
		using boost::spirit::lex::_start;
		using boost::spirit::lex::_end;
		using boost::phoenix::ref;

		this->self
			=   word  [++ref(w), ref(c) += distance1(_start, _end)]
			|   eol   [++ref(c), ++ref(l)]
			|   any   [++ref(c)]
			;
	}

	std::size_t c, w, l;
	lex::token_def<> word, eol, any;
};

int main(int argc, char* argv[])
{
	typedef 
		lex::lexertl::token<char const*, lex::omit, boost::mpl::false_> 
		token_type;

	typedef lex::lexertl::actor_lexer<token_type> lexer_type;

	word_count_tokens<lexer_type> word_count_lexer;

	std::string str ("Our hiking boots are ready.  So, let's pack!\n\
					 \n\
		Have you the plane tickets for there and back?\n\
		\n\
		I do, I do.  We're all ready to go.  Grab my hand and be my beau.\n\
		\n\
		\n");
	char const* first = str.c_str();
	char const* last = &first[str.size()];

	lexer_type::iterator_type iter = word_count_lexer.begin(first, last);
	lexer_type::iterator_type end = word_count_lexer.end();

	while (iter != end && token_is_valid(*iter))
		++iter;

	if (iter == end) {
		std::cout << "lines: " << word_count_lexer.l 
			<< ", words: " << word_count_lexer.w 
			<< ", characters: " << word_count_lexer.c 
			<< "\n";
	}
	else {
		std::string rest(first, last);
		std::cout << "Lexical analysis failed\n" << "stopped at: \"" 
			<< rest << "\"\n";
	}
	return 0;
}

Change History (2)

comment:1 by Nikita Kniazev <nok.raven@…>, 5 years ago

I tried your example and it ran fine (except for #6869), and I got lines: 7, words: 33, characters: 178 in both cases. I'm sorry, but I'm not going to track down the commit where it was fixed (it might have been a Phoenix V2 problem).

comment:2 by Joel de Guzman, 5 years ago

Resolution: fixed
Status: new → closed
Note: See TracTickets for help on using tickets.