Opened 12 years ago

Last modified 11 years ago

#4899 new Bugs

Parallel graphs don't work with named vertices

Reported by: cpaolino@… Owned by: ngedmond
Milestone: To Be Determined Component: graph
Version: Boost 1.45.0 Severity: Problem
Keywords: parallel, distributed, betweenness, named vertices Cc: lana-dev@…

Description

brandes_betweenness_centrality doesn't work with named vertices.

Here are two reproducible examples: https://gist.github.com/f02f18f30f0eef146a58

This is probably caused by the lack of a local() function in the hashed_distribution declared in named_graph.hpp.

See also this discussion on the mailing list: http://lists.boost.org/boost-users/2010/11/64489.php

Change History (6)

comment:1 by cpaolino@…, 12 years ago

Another useful info might be, as Cedric Laczny pointed out on that mailing list thread:

…when I look at your error logs, I see

/usr/local/include/boost/graph/distributed/shuffled_distribution.hpp: In member function ‘size_t boost::graph::distributed::shuffled_distribution<BaseDistribution>::operator() (const T&) const [with T = size_t, BaseDistribution = boost::graph::distributed::hashed_distribution<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >]’:

The problem here is the collision between size_t and char* (std::basic_string). The argument should be of type size_t (index of a vertex for the local process) but the underlying distribution is based on std::basic_string, probably due to the named vertices (specialization in named_graph.hpp). This causes an error at line 40 of boost/graph/distributed/named_graph.hpp which is called by line 68 in boost/graph/distributed/shuffled_distribution.hpp. So the specialization to named vertices could be the cause.

While this might not resolve the issue with local(), it could actually track down one error and might reveal a bug.

comment:2 by Jeremiah Willcock, 12 years ago

Owner: changed from Andrew Sutton to ngedmond

comment:3 by Jeremiah Willcock, 12 years ago

Owner: changed from ngedmond to Jeremiah Willcock
Status: newassigned

I agree with comment 1 -- it seems like you are trying to use a distribution that uses numbers to index vertex locations (it appears to use an std::vector), while something is trying to find vertex locations using their names. Do any PBGL algorithms work with your named vertex example? How important is it to you to use named vertices as opposed to just vertex numbers?

comment:4 by Jeremiah Willcock, 12 years ago

Summary: Parallel Brandes Betweenness Centrality doesn't work with named vertices (hashed_distribution doesn't have `local()`)Parallel graphs don't work with named vertices

in reply to:  3 comment:5 by cpaolino@…, 12 years ago

Replying to jewillco:

I agree with comment 1 -- it seems like you are trying to use a distribution that uses numbers to index vertex locations (it appears to use an std::vector), while something is trying to find vertex locations using their names.

I think it's the other way around: I'm trying to use named graphs with an algorithm that not expects them.

Do any PBGL algorithms work with your named vertex example?

Yes, PageRank and Degree centrality (the same from the non-parallel BGL, with two line changes to make it work with distributed graphs; will make another bug report for that).

How important is it to you to use named vertices as opposed to just vertex numbers?

I'll quote myself and Cedric Laczny on that mailing list thread:

Cedric Laczny wrote:

Carmine Paolino wrote:

Yes, without this declaration (editor's note: that makes it a named graph) it does compile:

template<> struct internal_vertex_name<Vertex> { typedef multi_index::member<Vertex, std::string, &Vertex::name> type; };

But the comfort of having boost manage adding and finding vertexes also goes away, which means we have to use something like a map of all the inserted names on a single process to know if a name should be added to the graph or not. And that doesn't scale well. Anyway I'm open to suggestion on this if the named graph way doesn't work…

I agree with you on this one. I experienced the boost library to be very efficient in all cases where I used it and the authors probably have thought about this in order to make it suitable (and fast) for distributed purposes. In fact, I find this feature very convenient and I think it may be also helpful in non-distributed graphs. Using already existing features is IMHO definitively nicer than putting all kinds of maps or such on top for lookup-purposes. It will only bloat the code and make it harder to read and maintain and probably will steal preformance. That's why it is important to check if this is actually a bug :)

comment:6 by Jeremiah Willcock, 11 years ago

Owner: changed from Jeremiah Willcock to ngedmond
Status: assignednew
Note: See TracTickets for help on using tickets.