wiki:soc/2007/UserFriendlyGraph

Version 8 (modified by jsiek, 15 years ago) ( diff )

--

I am using this place to publish thoughts and updates about the design and implementation of this project. If you happen to be reading this and have some comments about the content, ideas, or questions, please leave them here - I'd like to read them.

New Graph Classes

I am currently considering alternative approaches to the implementation of the undirected_graph and directed_graph classes. One thing is clear - i intend to leverage as much existing code in Boost.Graph as possible, so it's probable that these classes will be implemented in terms of the adjacency_list<> class. The irony to this approach is that I am not introducing new classes so much as additional abstractions around an existing class. On the other hand, it does make implementation a breeze.

Since I'm planning on reusing the adjacency_list<> for this project, I should point out that its template parameters are in a somewhat strange order. I don't fully understand why the out edge list selector would be first and the edge list selector would be last (even after the vertex, edge and graph properties). I suspect that the edge list was templated as an afterthought and to preserve source code compatibility, it was added to the end of the template parameter list. Still, it's just a little ugly if you actually plan to specialize the class beyond its default types.

(Jeremy: Yes, the edge list selector was added much later, in response to a user request, and had to go at the end to preserve backwards compatibility, the blessing and curse that it is!)

Question: Should the [un]directed graph classes really allow the override of the underlying collections? If the intent of these classes is to get the users up and running, then perhaps those options could be seen as optimizations rather than algorithm design choices. Since these classes can act as drop-in replacements (and are in fact, implemented in terms of an adjacency list), it should be relatively easy to "upgrade" if the user really needs to change those options. Of course, changing the storage options actually affects properties like iterator and descriptor stability, but that's going to be a gotcha anyways.

(Jeremy: I don't think it is a good idea to allow the override... user's can go directly to adjacency_list if they want that level of customization. The iterator stability is an important issue and I'd recommend specifying that the user-friendly graph classes have to preserve stability.)

Question: Should the [un]directed classes implement or provide checks to ensure that graphs are simple or contain no self-loops? How can this be managed? My gut feeling is that a good answer to this question might be a complete redesign of the graph concept structure. For exmaple, graph properties might be encapsulated in a struct of tags containing items like allow directedness, parallel edge support, self-loop support, etc. This is probably something to think about in the long-term.

(Jeremy: I would not prohibit self-loops unless you found a very good motivation. Otherwise you're just getting in the user's way.)

I'm going to use the following abbreviations for template parameters quite frequently:

  • VL - The type of the vertex list. The vertex list stores all of the vertices of the graph.
  • EL - The type of the edge list. Like the vertex list, this stores all the edges in the graph.
  • OEL - The type of the out edge list. Each vertex has a corresponding out edge list (hence the name adjacency list) that stores its outgoing edges.
  • VP - The type of the vertex property (either internal or bundled). Each vertex is associated with an instance of this type.
  • EP - The type of the edge property (also internal or bundled). Each edge is associated with an instance of this type.
  • GP - The type of the graph property. These are "global" to the entire graph.

Rather than embed all of the design iterations into a single page, I've decided to give each their own page so they can be cleanly separated from the rest of the work here.

Measures and Algorithms

I haven't given a lot of thought to these since the semester just ended. However, I am starting to consider various type requirements for the algorithms and have spent some time playing with conceptg++. I did notice however, that there is a ticket already exists for finding cliques within a graph (#693) - even better, with links to source code. It might be worth pointing out, that the Bron and Kerbosch algorithm is written for a Boolean matrix so interpreting it for an adjacency list is a bit of a chore. I haven't actually read the other algorithm yet, hopefully it's promising.

(Jeremy: btw, there is a matrix-as-graph adaptor in the BGL, and it would be straightforward to provide a graph-as-matrix adaptor.)

On a side note, I started codifying the Boost.Graph graph concepts into a concepts header just to see if I could make it work - I haven't succeeded yet. Still, it's a good excercise and makes you think about concept hierarchies, nested requirements and the like. Also, i don't like the keyword concept_map. Not the idea itself - the keyword. If I'm not mistaken, this keyword replaces model from earlier proposals. Alas...

(Jeremy: very cool! Yes, the concept_map keyword royally sucks. It turned out that model was already used in tons of C++ code, so we'd break that code if we turned model into a keyword.)

Since I've been codigying graph concepts, I should point out that IncidenceGraph and EdgeListGraph both define source() and target() methods, so they're a little redundant. I think it would be appropriate to remove those methods from the requirements of EdgeListGraph since they don't really have anything to do with the listing or enumerating of edges. Just a thought...

(Jeremy: actually, source() and target() are pretty important to that concept. It would be better to refactor them into a common base concept.)

Documentation and Other Niceties

It seems to me that the Boost.Graph documentation has gotten a little stale and needs a face-lift (i.e., rewrite to quickbook). There are also some older parts of the documentation that don't appear to align with current practice - especially those parts related to internal or bundled properties. If you read closely, you get the idea that bundled properties are the preferred technique, but they have yet to earn an entry into the table of contents.

(Jeremy: yes, it would be good to make the bundled properties more visible and better documented.)

Of course, the documentation isn't the only place where Boost.Graph doesn't align with the rest of Boost. For example, there is not boost::graph namespace. While this isn't exactly a tragedy for the ages, it would provide a clean separation from the rest of Boost. I am considering implementing my [un]directed_graphs in a boost::graph namespace.

(Jeremy: yes, I'm in favor of going with a boost::graph namespace. It turns out that back when the BGL was added to Boost we had not yet developed boost guidelines for namespace use.)

Another advantage of adding the boost::graph namespace is that it provides a mechansim for doing some other interesting things like reordering the parameters of common Boost.Graph function calls. One of my biggest complaints when I started learning the library was that the graph was always passed as the last argument in these functions. This doesn't really mesh well with the common "C-style" object-oriented approach where the type of the first parameter generally indicates the type to which said function would belong as a method. Is it really that big of a deal? I don't know... but it would be pretty easy to reorder parameters likes this (for example):

namespace boost { 
  namespace graph {
    template <class C>
    inline std::pair<typename C::edge_descriptor, bool>
    add_edge(boost::directed_graph_helper<C>& g, typename C::vertex_descriptor u, C::vertex_descriptor v)
    {
      return boost::add_edge(u, v, g);
    }
  }
}

Just a thought... There's probably some reason why this might break that I haven't thought of. And yes... that's what the signature of the add_edge() function actually looks like for directed graphs.

(Jeremy: I'm not a fan of changing the parameter ordering as that change would have to propagate throughout the BGL, and it would break lots of user code.)

Note: See TracWiki for help on using the wiki.