Opened 11 years ago

Closed 11 years ago

Last modified 10 years ago

#5566 closed Bugs (fixed)

Spirit.Lex converting a token to its value in a lexer semantic action fails

Reported by: Reko Tiira <reko@…>
Owned by: Hartmut Kaiser
Milestone: To Be Determined
Component: spirit
Version: Boost 1.46.1
Severity: Problem
Keywords:
Cc:

Description

I discussed this on #boost with heller, who suggested filing a bug report. Here is a test case that should apparently work but fails to compile:

#include <string>
#include <vector>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>

namespace lex = boost::spirit::lex;

template <typename Lexer>
struct text_decorator_lexer : lex::lexer<Lexer>
{
        text_decorator_lexer()
                : tag_open("\\{\\{.+?\\}\\}")
                , tag_close("\\{\\{\\/.+?\\}\\}")
                , var("\\$\\$.+?\\$\\$")
                , any(".")
        {
                using boost::spirit::lex::_val;
                using boost::phoenix::push_back;
                using boost::phoenix::ref;

                this->self
                        = tag_open  [push_back(ref(tokens), _val)]
                        | tag_close [push_back(ref(tokens), _val)]
                        | var       [push_back(ref(tokens), _val)]
                        | any       [push_back(ref(tokens), _val)]
                        ;
        }

        std::vector<std::string> tokens;
        lex::token_def<std::string> tag_open, tag_close, var, any;
};

int main()
{
        typedef lex::lexertl::token<const char*, boost::mpl::vector<std::string>, boost::mpl::false_> token_type;
        typedef lex::lexertl::actor_lexer<token_type> lexer_type;

        text_decorator_lexer<lexer_type> text_decorator_tokenizer;
        std::string input = "Hello {{b}}there{{/b}}, the time is $$time$$.";

        char const* first = input.c_str();
        char const* last = &first[input.size()];
        lex::tokenize(first, last, text_decorator_tokenizer);
}

And the relevant part of the error is:

/opt/local/include/boost/spirit/home/phoenix/stl/container/container.hpp:492:40: error: no matching function for call to 'std::vector<std::basic_string<char> >::push_back(const boost::variant<boost::detail::variant::over_sequence<boost::mpl::v_item<boost::iterator_range<const char*>, boost::mpl::v_item<std::basic_string<char>, boost::mpl::vector0<mpl_::na>, 1>, 1> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>&)'
/opt/local/include/boost/spirit/home/phoenix/stl/container/container.hpp:492:40: note: candidate is:
/opt/local/include/gcc46/c++/bits/stl_vector.h:826:7: note: void std::vector<_Tp, _Alloc>::push_back(const value_type&) [with _Tp = std::basic_string<char>, _Alloc = std::allocator<std::basic_string<char> >, std::vector<_Tp, _Alloc>::value_type = std::basic_string<char>]
/opt/local/include/gcc46/c++/bits/stl_vector.h:826:7: note:   no known conversion for argument 1 from 'const boost::variant<boost::detail::variant::over_sequence<boost::mpl::v_item<boost::iterator_range<const char*>, boost::mpl::v_item<std::basic_string<char>, boost::mpl::vector0<mpl_::na>, 1>, 1> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>' to 'const value_type&'
/opt/local/include/boost/spirit/home/phoenix/stl/container/container.hpp:492:40: error: return-statement with a value, in function returning 'void' [-fpermissive]

Tested with:

gcc version 4.6.1 20110325 (prerelease) (GCC)
gcc version 4.2.1 (Apple Inc. build 5666) (dot 3)

Change History (8)

comment:1 by Hartmut Kaiser, 11 years ago

Owner: changed from Joel de Guzman to Hartmut Kaiser

in reply to:  description comment:2 by Hartmut Kaiser, 11 years ago

Replying to Reko Tiira <reko@…>:

This does not work quite as you expect because the type of the token value is a variant (initially holding a pair of iterators pointing to the matched input sequence). You need to explicitly convert this iterator range into the attribute type. Normally this is done while parsing, on first access to the attribute. Here is your code, modified to do this:

#include <string>
#include <vector>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>

namespace lex = boost::spirit::lex;
namespace phoenix = boost::phoenix;


struct get_string_impl
{
    template <typename Value>
    struct result
    {
        typedef std::string type;
    };

    template <typename Value>
    std::string operator()(Value const& val) const
    {
        // transform the token value (here a variant) into a string
        // at this point the variant holds a pair of iterators
        typedef boost::iterator_range<char const*> iterpair_type;
        iterpair_type const& ip = boost::get<iterpair_type>(val);

        return std::string(ip.begin(), ip.end());
    }
};

boost::phoenix::function<get_string_impl> get_string;

template <typename Lexer>
struct text_decorator_lexer : lex::lexer<Lexer>
{
        text_decorator_lexer()
                : tag_open("\\{\\{.+?\\}\\}")
                , tag_close("\\{\\{\\/.+?\\}\\}")
                , var("\\$\\$.+?\\$\\$")
                , any(".")
        {
                using boost::spirit::lex::_val;
                using boost::phoenix::push_back;
                using boost::phoenix::ref;

                this->self
                        = tag_open  [push_back(ref(tokens), get_string(_val))]
                        | tag_close [push_back(ref(tokens), get_string(_val))]
                        | var       [push_back(ref(tokens), get_string(_val))]
                        | any       [push_back(ref(tokens), get_string(_val))]
                        ;
        }

        std::vector<std::string> tokens;
        lex::token_def<std::string> tag_open, tag_close, var, any;
};

int main()
{
        typedef lex::lexertl::token<
            const char*, 
            boost::mpl::vector<std::string>, 
            boost::mpl::false_
        > token_type;
        typedef lex::lexertl::actor_lexer<token_type> lexer_type;

        text_decorator_lexer<lexer_type> text_decorator_tokenizer;
        std::string input = "Hello {{b}}there{{/b}}, the time is $$time$$.";

        char const* first = input.c_str();
        char const* last = &first[input.size()];
        lex::tokenize(first, last, text_decorator_tokenizer);
}

But to be honest, I needed to fix an (unrelated) problem to make this code work, so you'll need to update from SVN or wait for Boost V1.47.
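
The conversion step can also be illustrated without Boost. What follows is only a minimal sketch, assuming a C++17 compiler: std::variant stands in for the boost::variant held by the token, and the names iter_range, token_value, to_string and collect are hypothetical, not part of Spirit.Lex. Unlike get_string_impl, it also covers the case where the variant already holds a converted string.

```cpp
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for boost::iterator_range<char const*>.
struct iter_range
{
    char const* first;
    char const* last;
};

// Hypothetical stand-in for the token value: it starts out as an
// iterator pair and may later be replaced by the converted string.
typedef std::variant<iter_range, std::string> token_value;

// Convert whichever alternative is currently active into a std::string.
inline std::string to_string(token_value const& val)
{
    if (iter_range const* r = std::get_if<iter_range>(&val))
        return std::string(r->first, r->last);
    return std::get<std::string>(val);
}

// Collect the token text. Passing `val` straight to push_back would not
// compile, because a variant does not convert implicitly to std::string.
inline void collect(std::vector<std::string>& out, token_value const& val)
{
    out.push_back(to_string(val));
}
```

Calling collect with a token_value built from an iter_range appends the matched text, and calling it with a token_value that already holds a std::string appends that string unchanged.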

Last edited 11 years ago by Hartmut Kaiser

comment:3 by Hartmut Kaiser, 11 years ago

Resolution: fixed
Status: changed from new to closed

(In [72201]) Spirit: fixed #5566: Spirit.Lex converting a token to its value in a lexer semantic action fails

comment:4 by Gilles <gillesb68@…>, 10 years ago

From what I see of the token's variant, the converted value replaces the pair of iterators pointing to the matched input sequence. Is there a way to store the converted value while at the same time keeping the pair of iterators pointing to the matched input sequence?

in reply to:  4 ; comment:5 by Hartmut Kaiser, 10 years ago

Replying to Gilles <gillesb68@…>:

Yes, simply use position_token instead of the plain token class.

in reply to:  5 ; comment:6 by Gilles <gillesb68@…>, 10 years ago

Replying to hkaiser:

I understand this for a token that holds only the iterator pair pointing to the matched input sequence, but how do I declare a token that holds both of the following:

1/ the pair of iterators pointing to the matched input sequence; and

2/ the converted value of the same input sequence?

in reply to:  6 comment:7 by Hartmut Kaiser, 10 years ago

Replying to Gilles <gillesb68@…>:

Yes, that's what you can do with the position_token type. It is semantically equivalent to the token type, except that it additionally stores the iterator pair of the matched input sequence. Please see libs/spirit/example/qi/compiler_tutorial/conjure3 for an example.
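
As a rough illustration of what storing the iterator pair in addition to the value means, here is a plain C++ model. It is not the Spirit.Lex class: position_token_model and all of its members are hypothetical names chosen for this sketch, and the real position_token keeps its value in a variant rather than a plain string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical model of a position-tracking token: the matched range is
// stored separately, so converting the value does not overwrite it.
struct position_token_model
{
    char const* first;   // start of the matched input
    char const* last;    // one past the end of the matched input
    std::string value;   // converted attribute, unset until first access
    bool has_value;      // true once `value` has been filled in

    position_token_model(char const* f, char const* l)
        : first(f), last(l), value(), has_value(false) {}

    // Convert on first access, then reuse the stored result.
    std::string const& get_value()
    {
        if (!has_value)
        {
            value.assign(first, last);
            has_value = true;
        }
        return value;
    }

    // The matched range stays available after the conversion.
    std::size_t matched_length() const
    {
        return static_cast<std::size_t>(last - first);
    }
};
```

After get_value() has run, first and last still point at the original input, which is the property asked about in comment 6.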

comment:8 by anonymous, 10 years ago

Ah, the position_token class, of course, I should have thought of that. Thanks a lot, and I apologize for making you waste time on something so obvious.
