wiki:soc/2007/UserFriendlyGraphs/PropertyMaps

Problems with Properties

Much related to the vertex indexing issues, is that of building exterior property maps for custom graph types. Granted, we can basically avoid all external properties by simply forcing our graphs to use interior bundled properties, but there are some problems with that. First, there's tons of overhead for properties that may only need to be used once. Second, we have to change the graph structure every time we need a new property map. Clearly, this isn't a good solution.

Actually, it's trivial to build a property map if your storage component is selected as vecS. Why? Well... it's a kind of involved answer, but basically, with vectors, the vertex descriptor becomes an index into an exterior vector. That index acts as the mapping between the vertex and the property stored in the exterior vector. Like this:

adjacency_list<vecS, vecS> g;
vertex_descriptor verts[5];
for(int i = 0; i < 5; ++i) add_vertex(g);

vector<color> colors(num_vertices(g));
colors[verts[3]] = blue;

I see two serious problems with this snippet, but unfortunately, its the basic template followed by nearly every example in the library. The first, and most fatal problem is the assumption that the vertex descriptor returned by add_vertex() acts as its own index - this is patently untrue (see the discussion on vertex indexing for a more complete description of the problem. The second problem is that this mapping mechanism is non-portable for any graph using non-vector storage. Why? Well, because if we declared the list like this: adjacency_list<vecS, listS> then the vertex_descriptor type is actually void*. Even if a vector allowed indexing with pointer types, the integer-values are probably guaranteed to be well out of the bounds of the array.

You'll also see exterior property maps passed to algorithms like this:

vector<int> comps(num_vertices(g));
connected_components(g, &comps[0]);

What's up with taking the address of the first element of the vector? It's actually a trick. Somebody, somewhere specialized a type of property map that operates on pointer types. From the implementation standpoint, imagine that the property actually has this floating around:

Type* m_array; // here, Type == int (the type of the vector)

Which means that indexing works like this (more or less) - i'm abbreviating a fair bit.

Type
property_map<...>::get(Key k) //  Key == vertex_descriptor
{
  return array[k];
}

Because vertex_descriptors are integers (pretending to be their own indices), the property map simply returns the vertex's offset in the array given by the vector. While this works great, and is probably the best constant-time property access avaialble, it makes it really hard to figure out what's actually going on. In short, it has poor readability. Moreover, it's makes it kind of hard to figure out how to port this to graphs non-vector storage (like the [un]directed_graph classes). It's also somewhat fragile since it requires that vertex indices actually exist in the range [0, num_vertices(g)).

Explicit Implict Property Maps

Alot of the above property map stuff is hidden in the details. However, close inspection of the libraries shows a set of functions that are probably responsible for adapting that code to the property map interfaces. It's found in properties.hpp, and all functions have the name: make_iterator_vertex_map (with different parameters, of course). So, it might be a better idea to use these functions explicitly in the documentation. Calls could be of the form:

vector<int> comps;
connected_components(g, make_container_vertex_map(comps));

Or something like that... At least its a little more explicit. Also, it's kind of hard to see what type these maps are - which is why they're almost always passed as functions and typedef'd.

The "most" precise way to actually implement these is as follows. First, we need a type that actually performs the mapping - more or less. Essentially, this is a container that overloads the operator[] operator and the index is either an edge or vertex descriptor and the value is... well, dependant upon the mapping. This is the primary mapping object for the vertex/edge-to-value object. The important thing about this type is that it can add overhead to algorithms if the mapping is poorly chosen. For example, std::map is probably a poor choice because its lookup (mapping) is logarithmic (not constant).

If we're talking about the typical vector-based vertex storage, then this is what we might do (for a distance map, per-se).

typedef std::vector<int> DistanceMapping;

Here, the int refers to the distance type and not the key type (descriptor). Because it's a vector, the key type has to be an integer type (probably unsigned). This works out great for vector-based vertex storage since descriptors magically happen to be unsigned integers. Its even better since the value of those descriptors magically align to the range [0, num_vertices(g)) so the mapping is trivial, and very, very fast (probably the fastest you can get).

The second thing we really need is an adaptor between the mapping type and the Boost.PropertyMap library. Fortunately, there are already a bunch of these - for this application, we're looking at an iterator_property_map. This is an iterator property map because the indices of vector can also act as iterators to the container (as pinters). The only thing we have to do is define the types responsible for the actual mapping and adapt that to the library. This is done with:

typedef iterator_property_map<DistanceMapping::iterator,
                              property_map<Graph, vertex_index_t>::type> DistanceMap;

That's plenty ugly, but it's actually correct. The first parameter says that the adapted property map uses the mapping objects iterator type for indexing values. Remember that this iterator type is just an unsigned integer - the same as the vertex_descriptor type. The second parameter actually tells the adapter how to translate from a vertex_descriptor type to a the distance iterator. This parameter is actually defined in terms of the graph for which the map is being defined. Although we don't really see it here, this entire declaration is basically defining a set of functions that does this:

int                            // the distance type (value)
get(const DistanceMaping& map, // the underlying mapping object (the vector)
    unsigned v)                // the vertex descriptor of vector-based vertex storage
{
   return map[v];
}

This is fundamentally what happens. The actual series of type transformations and function instantiation is a lot more complicated, but at the heart of the operation is a constant-time vector access. Unfortunately, if your descriptor isn't an unsigned integer in the range [0, num_vertices(g)), you'll have to read the next section.

It might also be worth pointing out that when you see code that looks like this:

dijkstra_shortest_paths(g, v, distance_map(&dist[0]));

that distance_map(&dists[0]) is actually building an iterator_property_map just like before. It just so happens that &dists[0] is a valid iterator type for a vector so the construction of the property map is completely hidden.

Truly Associative Property Maps

Generally speaking property maps require vertex and edge descriptors as keys for lookup. In order to get past some of these problems, we just need to figure out how to abstract the lookup of values based on edge and vertex descriptors. In short, we need to have our property maps act as associative maps. We could probably use a standard map, but unfortunately, that has log-time lookup, which while good, is too slow for algorithms that require accesses to lots of different properties. Non-constant property access is not a Good Thing. An alternative is to use the unordered_map from the TR1 spec - which unfortunately, Boost doesn't currently implement.

If your STL provider, does implement unordered_map you're in good shape. Here's how we can rewrite some of the code above using an unordered_map.

typedef unordered_graph<> graph_type;
typedef tr1::unordered_map<vertex_descriptor, color> vertex_color_map;
typedef associative_property_map<vertex_color_map> color_map;

graph_type g;

vertex_color_map colors_map(num_vertices(g));
color_map colors(cmap);

connected_components(g, colors);

It's not quite as pretty as the above code, but it's alot more functional - and bestof all, it actually works! This has no dependence whatsoever on vertex indices. Note that its probably a good idea to initialze the unordered_map with the number of vertices since growing a hashtable can require lots of re-hashing. Also, we don't really need to in pre-hash the graph since accesses to un-hashed vertices can be automatically added to the map (although I'm not sure what values will default to). More explicit initializaiton should probably be done in a for loop.

Unfortunately, this may not be the case for many algorithms.

Problems with Edges

It turns out that the actual type of an edge can be (for listS):

boost::detail::edge_desc_impl<boost::undirected_tag, void*>

I'm going to guess that there isn't a built-in hash for that type, so this won't work with exterior edge properties. At least not until you provide a hash function for it. Here's the hash function for list-based vertex storage (the edge type depends on the vertex type):

namespace std
{
    namespace tr1
    {
        template <>
        struct hash<boost::detail::edge_desc_impl<boost::undirected_tag, void*> >
            : public std::unary_function<boost::detail::edge_desc_impl<boost::undirected_tag, void*>, std::size_t>
        {
            typedef boost::detail::edge_desc_impl<boost::undirected_tag, void*> Edge;
            std::size_t operator ()(Edge e)
            {
                std::tr1::hash<Edge::property_type*> hasher;
                return hasher(e.get_property());
            }
        };
    }
}

Obviously, there would need to be some adjustments since we can build the hash for either a native tr1 implementation or for Boost.Hash. Also, we might need to duplicate this code to handle non-list-based vector storage. I think this just means changing the void* to unsigned or int for vectors. The hash function works by returning the hash value of the property pointer on the edge. This is actually typedef'd to be void* so this should work fine. Note that we can't build the hash value from the source and target vertices of the edge because there can be multiple edges for those vertices.

Helper Functions

It would be nice if there were a suite of functions and types that would help programmers declare exterior property maps. I think that these would essentially default to either vectors or unordered maps depending on how the graph is defined. It might be helpful to have something like:

typedef exterior_vertex_property<Graph, float>::property_type DistanceValues;
typedef exterior_vertrex_property<Graph, float>::mapping_type DistanceMap;

DistanceValues dists(num_vertices(g));
DistanceMap dist_map(dists);

Essentially this template is specialized for vector- and non-vector-based storage such that the proeprty type is vectors for vector-based and unordered_maps for non-vector based. The mapping type is either an associative property map or an iterator property map.

An alternative approach migth be to build a single type that exposes references to the values and the mapping as inline functions:

typedef exterior_vertex_property<Graph, float> DistanceProperty;

DistanceProperty dists(num_vertices(g));
dijkstra_shortest_paths(g, v, distance_map(dists.map()));

Again... maybe.

Last modified 15 years ago Last modified on Jul 6, 2007, 3:36:33 PM
Note: See TracWiki for help on using the wiki.