Opened 7 years ago

Last modified 7 years ago

#11341 new Bugs

dimacs_basic_reader reads corrupted data

Reported by: anonymous Owned by: Jeremiah Willcock
Milestone: To Be Determined Component: graph
Version: Boost 1.58.0 Severity: Problem
Keywords: dimacs Cc:

Description

As analyzed for http://stackoverflow.com/questions/30415388/how-to-read-dimacs-vertex-coloring-graphs-in-c/30446685#30446685

The parsing of edge lines is fundamentally broken.

A simple file as

p edge 2 1
e 1 2

will fail to read correct data (without raising an error). Instead, indeterminate values are read for the edge, and the results are undefined (at the very least depend on the graph model).

Related, I don't think the expression (char*) buf.c_str() is legal. I'd suggest at least replacing that by &buf[0].

Further I'd really suggest replacing the parser. For the subset that the question was about, I've suggested two implementations in the linked answer on SO:

#include <string>
#include <istream>
#include <sstream>

template <typename Graph>
bool read_coloring_problem(std::istream& dimacs, Graph& g) {
    size_t vertices = 0, edges = 0;

    std::string line;
    while (getline(dimacs, line))
    {
        std::istringstream iss(line);
        char ch;
        if (iss >> ch)
        {
            size_t from, to;
            std::string format;

            switch(ch) {
                case 'c': break;
                case 'p': 
                    if (vertices||edges) return false;
                    if (iss >> format >> vertices >> edges) {
                        if ("edge" != format) return false;
                    }
                    break;
                case 'e': 
                    if (edges-- && (iss >> from >> to) && (add_edge(from-1, to-1, g).second))
                        break;
                default: 
                    return false;
            }
        }
    }

    return !(edges || !dimacs.eof());
}

Or with Boost Spirit:

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/qi_match.hpp>

template <typename Graph>
bool read_coloring_problem(std::istream& dimacs, Graph& g) {
    auto add_edge_ = [&g](size_t from, size_t to) { add_edge(from, to, g); };
    size_t vertices = 0, edges = 0;

    using namespace boost::spirit::qi;
    namespace px = boost::phoenix;
    uint_parser<size_t> num_;

    auto eoil      = eol | eoi;
    auto comment   = boost::proto::deep_copy(lexeme["c " >> *(char_ - eol) >> eoil] | eol);
    auto vertices_ = px::ref(vertices);
    auto edges_    = px::ref(edges);

    dimacs >> std::noskipws >> phrase_match(
            *comment >>
            ("p" >> lit("edge") >> num_ [vertices_ = _1] >> num_ [edges_ = _1] >> eoil) >>
            repeat(edges_) [ 
            *comment >> ("e" >> num_ >> num_ >> eoil) [ px::bind(add_edge_, _1-1, _2-1) ]
            ] 
            >> *comment >> eoi
            , blank);

    return dimacs;
}

I realize there are missing features (weight)

Change History (1)

comment:1 by bugs@…, 7 years ago

Component: Nonegraph
Keywords: dimacs added
Owner: set to Jeremiah Willcock
Version: Boost 1.57.0Boost 1.58.0

(Oops. That went in prematurely. Site is really slow to handle)

I realize there are missing features (weights?), and perhaps enabling the iterator-range construction is faster (?). The non-spirit approach might works better for header-only application (due to compile times).

Let me know if you want help revisiting the parsing code here.

Note: See TracTickets for help on using tickets.