| 1 | == Problems with Properties == |
| 2 | Much related to the vertex indexing issues, is that of building exterior property maps for custom graph types. Granted, we can basically avoid all external properties by simply forcing our graphs to use interior bundled properties, but there are some problems with that. First, there's tons of overhead for properties that may only need to be used once. Second, we have to change the graph structure every time we need a new property map. Clearly, this isn't a good solution. |
| 3 | |
| 4 | Actually, it's trivial to build a property map if your storage component is selected as `vecS`. Why? Well... it's a kind of involved answer, but basically, with vectors, the vertex descriptor becomes an index into an exterior vector. That index acts as the mapping between the vertex and the property stored in the exterior vector. Like this: |
| 5 | |
| 6 | {{{ |
| 7 | #!cpp |
| 8 | adjacency_list<vecS, vecS> g; |
| 9 | vertex_descriptor verts[5]; |
| 10 | for(int i = 0; i < 5; ++i) add_vertex(g); |
| 11 | |
| 12 | vector<color> colors(num_vertices(g)); |
| 13 | colors[verts[3]] = blue; |
| 14 | }}} |
| 15 | |
| 16 | I see two serious problems with this snippet, but unfortunately, its the basic template followed by ''nearly every example in the library''. The first, and most fatal problem is the assumption that the vertex descriptor returned by `add_vertex()` acts as its own index - this is patently untrue (see the [wiki:soc/2007/UserFriendlyGraphs/Indexing discussion on vertex indexing] for a more complete description of the problem. The second problem is that this mapping mechanism is non-portable for any graph using non-vector storage. Why? Well, because if we declared the list like this: `adjacency_list<vecS, listS>` then the vertex_descriptor type is actually `void*`. Even if a vector allowed indexing with pointer types, the integer-values are probably guaranteed to be well out of the bounds of the array. |
| 17 | |
| 18 | You'll also see exterior property maps passed to algorithms like this: |
| 19 | |
| 20 | {{{ |
| 21 | #!cpp |
| 22 | vector<int> comps(num_vertices(g)); |
| 23 | connected_components(g, &comps[0]); |
| 24 | }}} |
| 25 | |
| 26 | What's up with taking the address of the first element of the vector? It's actually a trick. Somebody, somewhere specialized a type of property map that operates on pointer types. From the implementation standpoint, imagine that the property actually has this floating around: |
| 27 | |
| 28 | {{{ |
| 29 | #!cpp |
| 30 | Type* m_array; // here, Type == int (the type of the vector) |
| 31 | }}} |
| 32 | |
| 33 | Which means that indexing works like this (more or less) - i'm abbreviating a fair bit. |
| 34 | |
| 35 | {{{ |
| 36 | #!cpp |
| 37 | Type |
| 38 | property_map<...>::get(Key k) // Key == vertex_descriptor |
| 39 | { |
| 40 | return array[k]; |
| 41 | } |
| 42 | }}} |
| 43 | |
| 44 | Because `vertex_descriptors` are integers (pretending to be their own indices), the property map simply returns the vertex's offset in the array given by the vector. While this works great, and is probably the best constant-time property access avaialble, it makes it really hard to figure out what's actually going on. In short, it has poor readability. Moreover, it's makes it kind of hard to figure out how to port this to graphs non-vector storage (like the `[un]directed_graph` classes). It's also somewhat fragile since it requires that vertex indices actually exist in the range [0, `num_vertices(g)`). |
| 45 | |
| 46 | == Truly Associative Property Maps == |
| 47 | Generally speaking property maps require vertex and edge descriptors as keys for lookup. In order to get past some of these problems, we just need to figure out how to abstract the lookup of values based on edge and vertex descriptors. In short, we need to have our property maps act as associative maps. We could probably use a standard map, but unfortunately, that has log-time lookup, which while good, is too slow for algorithms that require accesses to lots of different properties. Non-constant property access is not a ''Good Thing''. An alternative is to use the `unordered_map` from the TR1 spec - which unfortunately, Boost doesn't currently implement. |
| 48 | |
| 49 | If your STL provider, does implement `unordered_map` you're in good shape. Here's how we can rewrite some of the code above using an `unordered_map`. |
| 50 | |
| 51 | {{{ |
| 52 | #!cpp |
| 53 | typedef unordered_graph<> graph_type; |
| 54 | typedef tr1::unordered_map<vertex_descriptor, color> vertex_color_map; |
| 55 | typedef associative_property_map<vertex_color_map> color_map; |
| 56 | |
| 57 | graph_type g; |
| 58 | |
| 59 | vertex_color_map colors_map(num_vertices(g)); |
| 60 | color_map colors(cmap); |
| 61 | |
| 62 | connected_components(g, colors); |
| 63 | }}} |
| 64 | |
| 65 | It's not quite as pretty as the above code, but it's alot more functional - and bestof all, it actually works! This has no dependence whatsoever on vertex indices. Note that its probably a good idea to initialze the `unordered_map` with the number of vertices since growing a hashtable can require lots of re-hashing. Also, we don't really need to in pre-hash the graph since accesses to un-hashed vertices can be automatically added to the map (although I'm not sure what values will default to). More explicit initializaiton should probably be done in a `for` loop. |
| 66 | |
| 67 | Unfortunately, this may not be the case for many algorithms. |
| 68 | |
| 69 | == Problems with Edges == |
| 70 | It turns out that the actual type of an edge can be (for listS): |
| 71 | |
| 72 | {{{ |
| 73 | #!cpp |
| 74 | boost::detail::edge_desc_impl<boost::undirected_tag, void*> |
| 75 | }}} |
| 76 | |
| 77 | I'm going to guess that there isn't a built-in hash for that type, so this won't work with exterior edge properties. At least not until you provide a hash function for it. Here's the hash function for list-based vertex storage (the edge type depends on the vertex type): |
| 78 | |
| 79 | {{{ |
| 80 | #!cpp |
| 81 | namespace std |
| 82 | { |
| 83 | namespace tr1 |
| 84 | { |
| 85 | template <> |
| 86 | struct hash<boost::detail::edge_desc_impl<boost::undirected_tag, void*> > |
| 87 | : public std::unary_function<boost::detail::edge_desc_impl<boost::undirected_tag, void*>, std::size_t> |
| 88 | { |
| 89 | typedef boost::detail::edge_desc_impl<boost::undirected_tag, void*> Edge; |
| 90 | std::size_t operator ()(Edge e) |
| 91 | { |
| 92 | std::tr1::hash<Edge::property_type*> hasher; |
| 93 | return hasher(e.get_property()); |
| 94 | } |
| 95 | }; |
| 96 | } |
| 97 | } |
| 98 | }}} |
| 99 | |
| 100 | Obviously, there would need to be some adjustments since we can build the hash for either a native tr1 implementation or for Boost.Hash. Also, we might need to duplicate this code to handle non-list-based vector storage. I think this just means changing the `void*` to `unsigned` or `int` for vectors. The hash function works by returning the hash value of the property pointer on the edge. This is actually typedef'd to be `void*` so this should work fine. Note that we can't build the hash value from the source and target vertices of the edge because there can be multiple edges for those vertices. |