wiki:soc/2007/UserFriendlyGraph

Version 4 (modified by Andrew Sutton, 15 years ago) ( diff )

--

I am using this place to publish thoughts and updates about the design and implementation of this project. If you happen to be reading this and have some comments about the content, ideas, or questions, please leave them here - I'd like to read them.

New Graph Classes

I am currently considering alternative approaches to the implementation of the undirected_graph and directed_graph classes. One thing is clear - i intend to leverage as much existing code in Boost.Graph as possible, so it's probable that these classes will be implemented in terms of the adjacency_list<> class. The irony to this approach is that I am not introducing new classes so much as additional abstractions around an existing class. On the other hand, it does make implementation a breeze.

Since I'm planning on reusing the adjacency_list<> for this project, I should point out that its template parameters are in a somewhat strange order. I don't fully understand why the out edge list selector would be first and the edge list selector would be last (even after the vertex, edge and graph properties). I suspect that the edge list was templated as an afterthought and to preserve source code compatibility, it was added to the end of the template parameter list. Still, it's just a little ugly if you actually plan to specialize the class beyond its default types.

I'm going to use the following abbreviations quite frequently:

  • VL - The type of the vertex list. The vertex list stores all of the vertices of the graph.
  • EL - The type of the edge list. Like the vertex list, this stores all the edges in the graph.
  • OEL - The type of the out edge list. Each vertex has a corresponding out edge list (hence the name adjacency list) that stores its outgoing edges.
  • VP - The type of the vertex property (either internal or bundled). Each vertex is associated with an instance of this type.
  • EP - The type of the edge property (also internal or bundled). Each edge is associated with an instance of this type.
  • GP - The type of the graph property. These are "global" to the entire graph.

An early prototype of the design proposed to simply (publicly) inherit the adjacency_list class like so:

namespace boost {
  template <class VL, class EL, class OEL, class VP, class EP, class GP>
  class undirected_graph : public adjacency_list<OEL, VL, undirectedS, VP, EP, GP, EL>
  { };
}

Obviously, this probably the simplest implementation possible with as much reuse as possible. In this strategy, I wouldn't even have to implement any of the requirement methods since graphs are always passed by reference, we can let the inheritance handle this for us. The problem with this approach is that if we ever decide to put members into this class, we can run the risk of object-slicing due to blind reuse of methods that I didn't write. While the alure of a one-line implementation is strong, we should probably defer to practical experience and good advice (in the form of C++ Coding Standards: 101 Rules Guidelines and Best Practices by Sutter and Alexandrescu). I'm not overloading virtual functions, accessing protected members, and I'm pretty sure adjacency_list<> wasn't intended for inheritance.

A second design approach uses composition as a member rather than private inheritance - not for any particular reason, I'm just more familiar with composition. Now, the prototype looks like this:

namespace boost {
  template <class VL, class EL, class OEL, class VP, class EP, class GP>
  class undirected_graph
  {
    typedef adjacency_list<OEL, VL, undirectedS, VP, EP, GP, EL> graph_type;
    graph_type graph;

    public:
      graph_type& impl();
      const graph_type& impl();

    // other stuff so we comply with graph_traits<>
  };

  // example add_vertex() function...
  template <class VL, class EL, class OEL, class VP, class EP, class GP>
  undirected_graph<VL, EL, OEL, VP, EP, GP>::vertex_descriptor
  add_vertex(undirected_graph<VL, EL, OEL, VP, EP, GP> &g)
  {
    return add_vertex(g.impl());  // calls add_vertex() for adjacency_list
  }
}

Clearly a very different approach, but once complete it should preserve all of the concept requirements given in the documentation. Interestingly, this is somewhat similar to graph adapter classes except that we aren't really adapting the adjacency list to a different purpose - we're just reducing the "abstractness" of the class by requiring (in this case), that it is instantiated with the undirectedS parameter. This approach is also a little more flexible than the previous in that we can add to this class without fear of unexpected consequences in the details of the original implementation. Of course, with this I'll have to write more code.

Question: Should the [un]directed graph classes really allow the override of the underlying collections? If the intent of these classes is to get the users up and running, then perhaps those options could be seen as optimizations rather than algorithm design choices. Since these classes can act as drop-in replacements (and are in fact, implemented in terms of an adjacency list), it should be relatively easy to "upgrade" if the user really needs to change those options. Of course, changing the storage options actually affects properties like iterator and descriptor stability, but that's going to be a gotcha anyways.

Question: Should the [un]directed classes implement or provide checks to ensure that graphs are simple or contain no self-loops? How can this be managed? My gut feeling is that a good answer to this question might be a complete redesign of the graph concept structure. For exmaple, graph properties might be encapsulated in a struct of tags containing items like allow directedness, parallel edge support, self-loop support, etc. This is probably something to think about in the long-term.

Measures and Algorithms

I haven't given a lot of thought to these since the semester just ended. However, I am starting to consider various type requirements for the algorithms and have spent some time playing with conceptg++. I did notice however, that there is a ticket already exists for finding cliques within a graph (#693) - even better, with links to source code. It might be worth pointing out, that the Bron and Kerbosch algorithm is written for a Boolean matrix so interpreting it for an adjacency list is a bit of a chore. I haven't actually read the other algorithm yet, hopefully it's promising.

On a side note, I started codifying the Boost.Graph graph concepts into a concepts header just to see if I could make it work - I haven't succeeded yet. Still, it's a good excercise and makes you think about concept hierarchies, nested requirements and the like. Also, i don't like the keyword concept_map. Not the idea itself - the keyword. If I'm not mistaken, this keyword replaces model from earlier proposals. Alas...

Since I've been codigying graph concepts, I should point out that IncidenceGraph and EdgeListGraph both define source() and target() methods, so they're a little redundant. I think it would be appropriate to remove those methods from the requirements of EdgeListGraph since they don't really have anything to do with the listing or enumerating of edges. Just a thought...

Documentation and Other Niceties

It seems to me that the Boost.Graph documentation has gotten a little stale and needs a face-lift (i.e., rewrite to quickbook). There are also some older parts of the documentation that don't appear to align with current practice - especially those parts related to internal or bundled properties. If you read closely, you get the idea that bundled properties are the preferred technique, but they have yet to earn an entry into the table of contents.

Of course, the documentation isn't the only place where Boost.Graph doesn't align with the rest of Boost. For example, there is not boost::graph namespace. While this isn't exactly a tragedy for the ages, it would provide a clean separation from the rest of Boost. I am considering implementing my [un]directed_graphs in a boost::graph namespace.

Another advantage of adding the boost::graph namespace is that it provides a mechansim for doing some other interesting things like reordering the parameters of common Boost.Graph function calls. One of my biggest complaints when I started learning the library was that the graph was always passed as the last argument in these functions. This doesn't really mesh well with the common "C-style" object-oriented approach where the type of the first parameter generally indicates the type to which said function would belong as a method. Is it really that big of a deal? I don't know... but it would be pretty easy to reorder parameters likes this (for example):

namespace boost { 
  namespace graph {
    template <class C>
    inline std::pair<typename C::edge_descriptor, bool>
    add_edge(boost::directed_graph_helper<C>& g, typename C::vertex_descriptor u, C::vertex_descriptor v)
    {
      return boost::add_edge(u, v, g);
    }
  }
}

Just a thought... There's probably some reason why this might break that I haven't thought of. And yes... that's what the signature of the add_edge() function actually looks like for directed graphs.

Note: See TracWiki for help on using the wiki.