Changes between Initial Version and Version 1 of soc/2007/UserFriendlyGraphs/Indexing


Ignore:
Timestamp:
Jun 25, 2007, 1:08:39 AM (15 years ago)
Author:
Andrew Sutton
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • soc/2007/UserFriendlyGraphs/Indexing

    v1 v1  
     1= A Discussion of Vertex Indexing =
     2Since switching the default storage selectors from `vecS` to `listS`, I've noticed exactly how few examples there are that actually use this. What's actually troubling about this is the fact that `vecS` secretly provides a `vertex_index` property which, as far as I can tell is used by nearly every algorithm in the library. Not surprisingly, `listS` does not. This can cause a number of misconceptions, namely this [http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?Documentation_-_Graph_Library using vertex_descriptor as a 0-based index] post on the Boost wiki.
     3
     4Anyhow, the problem is not that using `listS` as a storage selector really breaks a lot of code (it can), but that its hard to figure out how to create vertex and edge index maps for them. The Dijkstra example has a `listS` example, but it uses property maps instead of bundled properties. Apparently, the solution is to manually initialize the index map. This is both good and bad. First, it's bad because new user's won't really understand why they have to write the same chunk of code over and over. Maybe it would be nice to provide a default `index_vertices()` function and an `index_edges()` functions that do that work for us. On the other hand, It's good because it highlights the instability of indices over descriptors.
     5
     6I think the real I'm having with this is that so much of the documentation and so many of the examples operate on the notion that a `vertex_descriptor` is its own index - which it is most definitely not.
     7
     8Since originally writing this, and continuing to work on the graph libraries, I have come to believe that the indexing required by most algorithms is more of a developmental crutch than a true feature. The sole purpose of requiring an index it mapping from a vertex to a property in constant time. There are certainly other, better ways to do this that don't require an artifical mapping of vertex descriptors to indices, and then to properties. One idea might be to declare external properties as `hash_map`s that map from vertices to properties (Look Ma! No index!). Of course, this is well beyond the scope of my summer project, so obviously, I won't be trying to fully address this problem.
     9
     10Instead, I am going to focus on solutions for just my `[un]directed` classes. The following is a proposal for semi-automated vertex management.
     11
     12== Semi-Automated Index Management ==
     13One of the most misleading aspects of the Boost.Graph library is that in many cases we treat vertex descriptors as vertex indices. Many of the examples simply assume that descriptor is an integral type (which in most examples it is) and that the descriptor value magically corresponds to the index into the vector into which the vertex is added (which in most cases it does). Unfortunately, this is quite misleading and can be trivially invalidated using a core algorithm in the following manner:
     14
     15{{{
     16#!cpp
     17vertex_descriptor u = add_vertex(g);
     18vertex_descriptor v = add_vertex(g);
     19remove_vertex(u, g);
     20breadth_first_search(g, v, make_bfs_visitor(null_visitor())); // CRASH!
     21}}}
     22
     23This will probably result in a segmentation fault because of the algorithm's use of exterior property maps. Specifically, this algorithm uses an externior ''color'' property declared similarly to:
     24
     25{{{
     26#!cpp
     27vector<color> colors(num_vertices(g));
     28}}}
     29
     30Unfortunately, the only remaining vertex has index 1, which means when the algorithm runs this line of code (for example):
     31
     32{{{
     33#!cpp
     34if(colors[get(vertex_index, g, v)] == BLACK) ...
     35}}}
     36
     37Oops. That index isn't in the bounds of the array, and so this may result in a crashed program. Somewhat ironically, this problem is trivially solved by simple ''renumbering'' the vertex indices before calling the algorithm.
     38
     39== The Vertex Indexing Suite ==
     40Because of the near-ubiquitous approach of accessing exterior properties via vector indices, it seems logical that the Boost.Graph Library provide facilities for making this easier. This set proposed set of features addresses three common indexing problems:
     41
     42# The fact that `undirected_graph`s, `directed_graph`s, and `adjacency_list`s using `listS` as a vertex storage selector do not have vertex indices by default - which means that there are precious few algorithms these classes will work with out of the box. The current solution is to manually define and manage these property maps.
     43# The easy invalidation of indices on destructive graph operations (i.e, removing a vertex). It should be easy to reset the indices of a graph to a continuous integral range after a number of remove_vertex() operations.
     44# The inability to correctly identify bounds on exterior property map for a graph that has had vertices removed (without first renumbering the indices).
     45
     46The first part of the proposal requires a change to the `undirected_graph` and `directed_graph` class. Specifically, that they provide a default property map for vertex indices. This functionality can be propogated to the `adjancency_list` class (with `listS` vertex storage) at a later time.
     47
     48There are also three new graph functions in the proposed suite. These functions all require their graph parameters to be a model of PropertyGraph, with a a property map for the `vertex_index` as part of the graph itself.
     49
     50`get_vertex_index(v, g)`
     51  This is an alias for `get(vertex_index, g, v)` and provided for convenience rather than added functionality.
     52
     53`max_vertex_index(g)`
     54  This returns the total number of indices allocated for the vertices of the graph ''g''. The last vertex added to a graph will always have the index `max_vertex_index(g) - 1`. The return type for this function is the value type of the `vertex_index` property map, which is typically an integer type.
     55
     56`renumber_vertex_indices(g)`
     57  Renumber the indices in the vertex index map of ''g'' such that the index of each vertex is in the range [0, `num_vertices(g)`). Also, after calling this function, `max_vertex_index(g)` == `num_vertices(g`).
     58
     59Implementing these three functions will allow more flexible algorithms for mutable graphs. Specifically, this model relaxes many of the constraints that vertices have indices in the range [0, num_vertices(g)). Instead, algorithms can require indices in the range \[0, max_vertex_index(g)).
     60
     61As a downside, this relaxed constraint will result in over-sized vectors whose items may not correspond to vertices within the graph. However, if the algorithms employ vectors as exterior properties, there is probably a good chance that the difference in vector size and capacity will subsume space required by overallocating for missing vertices. Furthermore, correct use of the `get_vertex_index(v,g)` function will promote more correct accessing of said vectors. Since all valid vertex descriptors have valid and correct indices, there is no possibility of accessing an element in an exterior property that does not correspond to a valid vertex. As an added bonus, the explicit name should improve readability significantly and discourage new (and experienced?) developers from doing this:
     62
     63{{{
     64#!cpp
     65vertex_descriptor v = add_vertex(g);
     66colors[v] = ...
     67}}}
     68
     69which, for non-trivial programs is in fact, incorrect.
     70
     71This proposal is implemented in this form for the `undirected_graph` and `directed_graph` classes. Comments are welcome.