Opened 13 years ago

Last modified 10 years ago

#3265 new Feature Requests

parse vectors

Reported by: Diederick C. Niehorster <dcnieho@…>
Owned by: Vladimir Prus
Milestone: Boost 1.40.0
Component: program_options
Version: Boost 1.39.0
Severity: Optimization
Keywords:
Cc: dcnieho@…

Description

As discussed on the Boost users mailing list, when a user specifies a vector as the return target in their options_description, that is:

("vecopt", po::value<vector<float>>(&vfa), "")

instead of

("opt", po::value<float>(&fA), "")

a simple syntax for specifying such a vector (that is, an option that can occur multiple times), on the command line, in a config file or even in an environment variable (sic!) can be used.

Compare (for command line) [--test "(1 2 3)"] with [--test 1 --test 2 --test 3].

Adding support for this syntax in the validate() function for vectors ensures that the parentheses (feel free to change them to another symbol that makes better sense to you) will only be processed specially when the user specified a vector as output variable. It will not break any existing option input systems defined by users.

Note that it will still be possible to specify options multiple times, the values of which will be accumulated in the vector. One could even supply multiple vector-syntax inputs for the same option, which would then get concatenated together.
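To make the intended behaviour concrete, here is a minimal standard-library sketch of the expansion and accumulation described above. It is not the patch itself: `expand_values` is a hypothetical helper name, `std::stof` stands in for the `validate()`/`lexical_cast` step, and element type `float` is assumed.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost): turns the raw values collected
// for one option into floats. A value of the form "(a b c)" contributes
// three elements; any other value contributes exactly one.
std::vector<float> expand_values(const std::vector<std::string>& raw)
{
    std::vector<float> out;
    for (const std::string& s : raw) {
        std::vector<std::string> tokens;
        // Vector notation needs at least "()" and both delimiters.
        if (s.size() >= 2 && s.front() == '(' && s.back() == ')') {
            std::istringstream in(s.substr(1, s.size() - 2));
            std::string token;
            while (in >> token)          // whitespace-separated entries
                tokens.push_back(token);
        } else {
            tokens.push_back(s);         // plain value, handled as before
        }
        for (const std::string& token : tokens) {
            std::size_t used = 0;
            const float value = std::stof(token, &used);  // throws if not numeric
            if (used != token.size())
                throw std::invalid_argument("invalid option value: " + token);
            out.push_back(value);        // occurrences accumulate in order
        }
    }
    return out;
}
```

With this sketch, the raw values `"1"`, `"(2 3)"` and `"4"` for the same option end up as one vector of four floats, which is the concatenation behaviour the note above describes.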

The code is not yet finished: I still need to add support for vectors of strings, which is marked as TODO in the code.

proposed diff:

Index: value_semantic.hpp
===================================================================
--- value_semantic.hpp  (revision 54915)
+++ value_semantic.hpp  (working copy)
@@ -8,6 +8,8 @@

 #include <boost/throw_exception.hpp>

+#include <boost/algorithm/string.hpp>
+
 namespace boost { namespace program_options {

     extern BOOST_PROGRAM_OPTIONS_DECL std::string arg;
@@ -124,7 +126,8 @@
 #endif

     /** Validates sequences. Allows multiple values per option occurrence
-       and multiple occurrences. */
+       and multiple occurrences. This function is only called when user
+       supplied vector<T> as datatype in option_description */
     template<class T, class charT>
     void validate(boost::any& v,
                   const std::vector<std::basic_string<charT> >& s,
@@ -142,11 +145,37 @@
                 /* We call validate so that if user provided
                    a validator for class T, we use it even
                    when parsing vector<T>.  */
-                boost::any a;
-                std::vector<std::basic_string<charT> > v;
-                v.push_back(s[i]);
-                validate(a, v, (T*)0, 0);
-                tv->push_back(boost::any_cast<T>(a));
+
+                vector<std::basic_string<charT>> value;
+
+                // test if vector notation is used in input
+                if (*s[i].begin() == '(' && *s[i].rbegin() == ')')
+                {
+                    // test if it is a vector of strings
+                    if (is_same<T, string>::value||is_same<T, wstring>::value)
+                    {
+                        /** TODO: needs special treatment, cant simply split
+                            on space character
+                            For now, proceed as before */
+                        value.push_back(s[i]);
+                    }
+                    else
+                    {
+                        split( value, s[i].substr(1, s[i].size()-2), is_any_of(" ") );
+                    }
+                }
+                else
+                    value.push_back(s[i]);
+
+                // validate option values
+                for (unsigned j = 0; j < value.size(); j++)
+                {
+                    boost::any a;
+                    std::vector<std::basic_string<charT> > v;
+                    v.push_back(value[j]);
+                    validate(a, v, (T*)0, 0);
+                    tv->push_back(boost::any_cast<T>(a));
+                }
             }
             catch(const bad_lexical_cast& /*e*/) {
                 boost::throw_exception(invalid_option_value(s[i]));

Change History (4)

comment:1 by Diederick C. Niehorster <dcnieho@…>, 13 years ago

Cc: dcnieho@… added

Full proposed patch, now including support for vectors of strings with proper double quote support.

Index: value_semantic.hpp
===================================================================
--- value_semantic.hpp  (revision 54915)
+++ value_semantic.hpp  (working copy)
@@ -8,6 +8,8 @@

 #include <boost/throw_exception.hpp>

+#include <boost/algorithm/string.hpp>
+
 namespace boost { namespace program_options {

     extern BOOST_PROGRAM_OPTIONS_DECL std::string arg;
@@ -124,7 +126,8 @@
 #endif

     /** Validates sequences. Allows multiple values per option occurrence
-       and multiple occurrences. */
+       and multiple occurrences. This function is only called when user
+       supplied vector<T> as datatype in option_description */
     template<class T, class charT>
     void validate(boost::any& v,
                   const std::vector<std::basic_string<charT> >& s,
@@ -142,11 +145,41 @@
                 /* We call validate so that if user provided
                    a validator for class T, we use it even
                    when parsing vector<T>.  */
-                boost::any a;
-                std::vector<std::basic_string<charT> > v;
-                v.push_back(s[i]);
-                validate(a, v, (T*)0, 0);
-                tv->push_back(boost::any_cast<T>(a));
+
+                vector<std::basic_string<charT>> value;
+
+                // test if vector notation is used in input
+                if (*s[i].begin() == '(' && *s[i].rbegin() == ')')
+                {
+                    // test if a vector of strings is requested
+                    if (is_same<T, string>::value||is_same<T, wstring>::value)
+                    {
+                        /** Strings needs special treatment, cant simply split
+                            on space character as that might be part of the
+                            string. Using split_winmain we can allow for the
+                            same grammar within the brackets () as for strings
+                            on the command line, that is, e.g., use "" to
+                            encapsulate strings with spaces. For complete
+                            grammar, see links in the split_winmain source. */
+                        value = split_winmain(s[i].substr(1, s[i].size()-2));
+                    }
+                    else
+                    {
+                        split( value, s[i].substr(1, s[i].size()-2), is_any_of(" ") );
+                    }
+                }
+                else
+                    value.push_back(s[i]);
+
+                // validate option values
+                for (unsigned j = 0; j < value.size(); j++)
+                {
+                    boost::any a;
+                    std::vector<std::basic_string<charT> > v;
+                    v.push_back(value[j]);
+                    validate(a, v, (T*)0, 0);
+                    tv->push_back(boost::any_cast<T>(a));
+                }
             }
             catch(const bad_lexical_cast& /*e*/) {
                 boost::throw_exception(invalid_option_value(s[i]));
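
The reason strings need a quote-aware splitter can be shown with a small standard-library sketch. This is a deliberately simplified stand-in, not the grammar of `split_winmain` (it has no backslash escaping, for instance), and `split_quoted` is a hypothetical name used only for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits on spaces and tabs, except inside double
// quotes. The quotes themselves are dropped from the resulting tokens.
std::vector<std::string> split_quoted(const std::string& input)
{
    std::vector<std::string> tokens;
    std::string current;
    bool in_quotes = false;
    bool have_token = false;  // true once a token has started, so "" yields an empty token
    for (char c : input) {
        if (c == '"') {
            in_quotes = !in_quotes;
            have_token = true;
        } else if (!in_quotes && (c == ' ' || c == '\t')) {
            if (have_token) {            // a separator ends the current token
                tokens.push_back(current);
                current.clear();
                have_token = false;
            }
        } else {
            current += c;                // ordinary character, or whitespace inside quotes
            have_token = true;
        }
    }
    if (have_token)
        tokens.push_back(current);
    return tokens;
}
```

For the contents of `(one "two words" three)`, a plain split on spaces would give four pieces, while this sketch keeps `two words` together as a single entry.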

comment:2 by Sascha Ochsenknecht, 13 years ago

Here is the archived discussion on boost users list: boost user

comment:3 by Diederick C. Niehorster <dcnieho@…>, 11 years ago

Dear Volodya and/or Sascha,

I am still interested in this addition and use code that depends on it daily. So far I haven't run into any problems. Could you consider looking at this and including it?

Thank you very much for your time, let me know if I can help in any way!

Best, Dee

comment:4 by Diederick C. Niehorster <dcnieho@…>, 10 years ago

Hmm, here is a slightly updated version of the patch. Some uses of std names weren't prefixed with std::. Also, the call to split should split on any whitespace (tabs are fine too, after all) and have token compression on, in case, e.g., multiple spaces delimit two entries.

Index: value_semantic.hpp
===================================================================
--- value_semantic.hpp  (revision 54915)
+++ value_semantic.hpp  (working copy)
@@ -8,6 +8,8 @@

 #include <boost/throw_exception.hpp>

+#include <boost/algorithm/string.hpp>
+
 namespace boost { namespace program_options {

     extern BOOST_PROGRAM_OPTIONS_DECL std::string arg;
@@ -124,7 +126,8 @@
 #endif

     /** Validates sequences. Allows multiple values per option occurrence
-       and multiple occurrences. */
+       and multiple occurrences. This function is only called when user
+       supplied vector<T> as datatype in option_description */
     template<class T, class charT>
     void validate(boost::any& v,
                   const std::vector<std::basic_string<charT> >& s,
@@ -142,11 +145,41 @@
                 /* We call validate so that if user provided
                    a validator for class T, we use it even
                    when parsing vector<T>.  */
-                boost::any a;
-                std::vector<std::basic_string<charT> > v;
-                v.push_back(s[i]);
-                validate(a, v, (T*)0, 0);
-                tv->push_back(boost::any_cast<T>(a));
+
+                std::vector<std::basic_string<charT>> value;
+
+                // test if vector notation is used in input
+                if (*s[i].begin() == '(' && *s[i].rbegin() == ')')
+                {
+                    // test if a vector of strings is requested
+                    if (is_same<T, std::string>::value||is_same<T, std::wstring>::value)
+                    {
+                        /** Strings needs special treatment, cant simply split
+                            on space character as that might be part of the
+                            string. Using split_winmain we can allow for the
+                            same grammar within the brackets () as for strings
+                            on the command line, that is, e.g., use "" to
+                            encapsulate strings with spaces. For complete
+                            grammar, see links in the split_winmain source. */
+                        value = split_winmain(s[i].substr(1, s[i].size()-2));
+                    }
+                    else
+                    {
+                        split( value, s[i].substr(1, s[i].size()-2), is_space(), token_compress_on );
+                    }
+                }
+                else
+                    value.push_back(s[i]);
+
+                // validate option values
+                for (unsigned j = 0; j < value.size(); j++)
+                {
+                    boost::any a;
+                    std::vector<std::basic_string<charT> > v;
+                    v.push_back(value[j]);
+                    validate(a, v, (T*)0, 0);
+                    tv->push_back(boost::any_cast<T>(a));
+                }
             }
             catch(const bad_lexical_cast& /*e*/) {
                 boost::throw_exception(invalid_option_value(s[i]));
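
The difference between the two splitting strategies can be illustrated without Boost. The sketch below uses hypothetical helper names: `split_on_space` mimics splitting on every single space, and `split_on_whitespace_runs` mimics splitting on any whitespace with adjacent separators merged. It is only an approximation; as far as I know, Boost's `split` with `token_compress_on` can still yield empty tokens at the very start or end of the input, which this sketch drops.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Splits on every single space: two adjacent spaces yield an empty token.
std::vector<std::string> split_on_space(const std::string& s)
{
    std::vector<std::string> out(1);     // start with one (possibly empty) token
    for (char c : s) {
        if (c == ' ')
            out.push_back(std::string());
        else
            out.back() += c;
    }
    return out;
}

// Splits on runs of any whitespace (spaces, tabs, ...), never yielding
// empty tokens.
std::vector<std::string> split_on_whitespace_runs(const std::string& s)
{
    std::vector<std::string> out;
    std::string current;
    for (char c : s) {
        if (std::isspace(static_cast<unsigned char>(c))) {
            if (!current.empty()) {      // a run of whitespace ends one token
                out.push_back(current);
                current.clear();
            }
        } else {
            current += c;
        }
    }
    if (!current.empty())
        out.push_back(current);
    return out;
}
```

With the first function, an input containing two consecutive spaces produces an empty entry, which would then fail conversion to a number. The second function treats the run as a single separator and also accepts tabs.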

Best and thanks, Dee
