Opened 6 years ago
Last modified 5 years ago
#12859 new Feature Requests
std::string_view as token
Reported by: | | Owned by: | jsiek |
---|---|---|---|
Milestone: | To Be Determined | Component: | tokenizer |
Version: | Boost 1.63.0 | Severity: | Optimization |
Keywords: | string_view | Cc: | |
Description
When I use std::string_view as a token for the tokenizer, I get an error because it attempts to construct a std::string_view with two iterators, which it doesn't support.
It'd be very nice to support this efficiency-enhancing type by, perhaps, constructing the token from the underlying character array and a size when construction from two iterators would be ill-formed.
Change History (5)
comment:1 by , 6 years ago
comment:2 by , 6 years ago
This is my particular use case:
    #include <boost/tokenizer.hpp>
    #include <string_view>

    auto tokenize(std::string_view to_tokenize, const char* separators) {
        using TokenizerFunction = boost::char_separator<char>;
        using Tokenizer = boost::tokenizer<
            TokenizerFunction, std::string_view::iterator, std::string_view>;
        return Tokenizer{to_tokenize, TokenizerFunction{separators}};
    }
I want the Tokenizer to recognize that the token type is not constructible from two iterators, and try to construct it from a {pointer,size} pair instead.
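To make the request concrete, here is a minimal sketch of the kind of dispatch described above. It is not Boost.Tokenizer code: make_token and first_word are hypothetical names, and the fallback branch assumes the iterators are contiguous (as with std::string or std::string_view), which the library would have to guarantee or check itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical helper, not part of Boost.Tokenizer: builds a Token from the
// range [first, last). The fallback branch assumes contiguous iterators.
template <class Token, class Iterator>
Token make_token(Iterator first, Iterator last) {
    if constexpr (std::is_constructible_v<Token, Iterator, Iterator>) {
        // Token accepts an iterator pair directly (e.g. std::string).
        return Token(first, last);
    } else {
        // Fallback: construct from {pointer, size}.
        // An empty range is handled first so the end iterator is never dereferenced.
        if (first == last) return Token();
        const auto n = std::distance(first, last);
        assert(n > 0);  // precondition: last is reachable from first
        return Token(&*first, static_cast<std::size_t>(n));
    }
}

// Usage sketch: the first space-delimited word of `text`, as a non-owning view.
inline std::string_view first_word(std::string_view text) {
    auto end = std::find(text.begin(), text.end(), ' ');
    return make_token<std::string_view>(text.begin(), end);
}
```

With a token type like std::string the first branch is taken, so existing behaviour would stay the same; only types that reject an iterator pair would go through the {pointer, size} path.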
comment:3 by , 5 years ago
A few things here.
The tokenizer is basically a non-owning helper that relies on someone else owning the std::string, and the only connection between the owner and the tokenizer class is a pair of iterators. That makes me think this could be a little ambiguous, since it adds one layer between the std::string and the tokenizer: instead of std::string -> boost::tokenizer, it becomes std::string -> std::string_view -> boost::tokenizer. See my point? This could be a good discussion, and I would like to know your thoughts about it. Also, off the top of my head, std::string_view is still an experimental feature in most compilers, so that may be an issue to deal with at the Boost level. I would check that in depth.
Anyway, I played with this a little and came up with a POC. It can't be added to Boost as it wasn't properly tested, but at least it helped me see what it might look like. You can check the diff here -> https://github.com/boostorg/tokenizer/compare/develop...dmeden:not_safe_string_view_use_v1
Tested with GCC 6.3.
Again, this code is not Boost-ready yet; there are many things (iterators, etc.) that need to be tested and validated that I haven't put into the code yet.
Thanks, Dam
comment:4 by , 5 years ago
I'm starting to think that it's better to leave the code as-is. This problem might be better solved elsewhere, like a std::string_view wrapper constructible from a pair of iterators. Does this answer your RFC?
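For illustration, such a wrapper might look roughly like the sketch below. range_string_view is a hypothetical name, the code assumes both iterators point into the same contiguous character buffer, and it has not been checked against boost::tokenizer, which may require more from its token type than this constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <string_view>

// Hypothetical wrapper, not part of Boost or the standard library.
// Assumes [first, last) is a valid range over one contiguous char buffer.
class range_string_view : public std::string_view {
public:
    range_string_view() = default;

    // Allow conversion from an ordinary view.
    range_string_view(std::string_view sv) : std::string_view(sv) {}

    // The missing piece: construction from a pair of iterators.
    // An empty range yields an empty view without dereferencing `last`.
    template <class Iterator>
    range_string_view(Iterator first, Iterator last)
        : std::string_view(
              first == last
                  ? std::string_view()
                  : std::string_view(&*first,
                                     static_cast<std::size_t>(
                                         std::distance(first, last)))) {}
};
```

Because it derives from std::string_view, the wrapper can be passed wherever a std::string_view is expected, and it stays non-owning: the underlying buffer must outlive it.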
comment:5 by , 5 years ago
There was a discussion about that (having a constructor that takes a pair of iterators) in the standard:
Sure, you can wrap it to cover your needs. You can have a look at that thread and follow some of the notes they've raised.
Thanks,
Damian.
Hi, could you please provide more information on what you are trying to achieve? A code sample would be great (TokenizerFunc, etc.). There are a few things you can do with a string_view, but I do not want to give any answer before seeing what you are trying to do.
Thanks, Dam.