Opened 14 years ago

Closed 13 years ago

Last modified 9 years ago

#1980 closed Feature Requests (wontfix)

[Boost Graph library] functionality to reserve memory

Reported by: marc.albers@…
Owned by: Jeremiah Willcock
Milestone: To Be Determined
Component: graph
Version: Boost 1.35.0
Severity: Optimization
Keywords: Boost Graph library, std::vector, reserve memory
Cc:

Description

Hi,

I would like to propose a feature for the Boost Graph library.

Summary:

For large std::vector containers, reserving memory in advance can improve performance significantly if many objects are added. I'd like to have this option for boost::adjacency_list when the selector type vecS is used for storage of vertices and edges. This option would help to avoid frequent reallocations when using add_vertex() and add_edge().
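The effect described in the summary can be observed on a plain std::vector, independent of Boost. The following is a minimal sketch; the helper name count_reallocations is made up for illustration and is not part of any library. It counts how often the capacity changes while elements are appended, with and without a prior reserve().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts how often push_back changes the vector's
// capacity while appending n ints. If reserve_first is true, capacity for
// all n elements is requested up front.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            // The capacity grew, so the storage was reallocated.
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserve_first set, the count is zero; without it, the count depends on the standard library's growth policy, and each reallocation moves or copies every element stored so far.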

My Problem:

I am using the Boost Graph library for a large-scale railway crew scheduling problem. The graph is defined as boost::adjacency_list with vecS as storage selection for vertices and edges, i.e., std::vector is used. For some problem instances, the graph can have 35,000 vertices and 4,000,000 edges. The vertex descriptors have a size of up to 40 bytes and edge descriptors have a size of up to 100 bytes. The graph is created using the add_vertex() and add_edge() methods. Creating this graph requires several hours on my Pentium M 1.7 GHz, which seems too long.

My Solution:

The file \boost\graph\detail\adjacency_list.hpp is modified as follows. Class vec_adj_list_impl gets three new methods:

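// Reserve storage for nv vertices.
inline void v_reserve(const vertices_size_type& nv) {
    m_vertices.reserve(nv);
}

// Reserve storage for ne out-edges of vertex v.
// v is taken by value so that temporaries can be passed.
inline void oe_reserve(vertex_descriptor v, const edges_size_type& ne) {
    m_vertices[v].m_out_edges.reserve(ne);
}

// Reserve storage for ne in-edges of vertex v.
inline void ie_reserve(vertex_descriptor v, const edges_size_type& ne) {
    m_vertices[v].m_in_edges.reserve(ne);
}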

Using these methods, the time to create a large-scale graph drops from hours to minutes (a speedup of more than a factor of 10). I also modified the method copy_impl to speed up copying.
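How such methods would be used during construction can be sketched with a toy stand-in. ToyGraph below is a hypothetical, simplified structure and not Boost's vec_adj_list_impl; only the method names v_reserve and oe_reserve are taken from the proposal above, and build_star is an invented example.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a vector-backed adjacency list; NOT Boost's implementation.
struct ToyGraph {
    struct Vertex {
        std::vector<std::size_t> out_edges;  // indices of target vertices
    };
    std::vector<Vertex> vertices;

    // Mirrors the proposed v_reserve: capacity only, no vertices are created.
    void v_reserve(std::size_t nv) { vertices.reserve(nv); }

    // Mirrors the proposed oe_reserve for one existing vertex.
    void oe_reserve(std::size_t v, std::size_t ne) {
        vertices.at(v).out_edges.reserve(ne);
    }

    // Appends a vertex and returns its index.
    std::size_t add_vertex() {
        vertices.emplace_back();
        return vertices.size() - 1;
    }

    // Appends a directed edge u -> v to u's out-edge list.
    void add_edge(std::size_t u, std::size_t v) {
        vertices.at(u).out_edges.push_back(v);
    }
};

// Builds a star graph: vertex 0 has an edge to each of the other n - 1 vertices.
ToyGraph build_star(std::size_t n) {
    ToyGraph g;
    g.v_reserve(n);  // one allocation for all vertices
    for (std::size_t i = 0; i < n; ++i) g.add_vertex();
    if (n > 0) g.oe_reserve(0, n - 1);  // one allocation for all out-edges of vertex 0
    for (std::size_t i = 1; i < n; ++i) g.add_edge(0, i);
    return g;
}
```

The point of the pattern is that the sizes are passed once, before the add_vertex() and add_edge() loops, so neither loop triggers a reallocation.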

I attached my modification in a file.

Comment:

These methods are only a workaround. Since the documentation of add_vertex() and add_edge() mentions the problem of reallocations, I suspect that the authors have already thought about reserving memory, but for some reason did not implement it. As my example shows, there are situations where reserving memory is very useful. This feature is desirable for any user working with large-scale graphs that require an implementation with std::vector.

Marc

Change History (5)

comment:1 by marc.albers@…, 14 years ago

Sorry, I cannot attach the file: "Akismet says content is spam". Anyway, I think my point is clear.

comment:2 by Jeremiah Willcock, 13 years ago

Owner: changed from Douglas Gregor to Jeremiah Willcock
Status: new → assigned

Do you have the edges of your graph in advance? Are you mutating the structure of the graph? If you have the entire structure before creating the graph, and will not change it afterwards, please use compressed_sparse_row_graph instead of adjacency_list.

comment:3 by Jeremiah Willcock, 13 years ago

Resolution: wontfix
Status: assigned → closed

Closing because submitter did not respond. The CSR graph is a better choice when the size of the graph is known in advance.
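Why a compressed sparse row (CSR) layout suits a graph whose edges are all known in advance can be illustrated with a standalone sketch. The code below is not Boost's compressed_sparse_row_graph; CsrGraph and build_csr are hypothetical names for a minimal version of the general technique, which sizes two arrays exactly once and never reallocates.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Minimal CSR layout: the out-edges of vertex v are
// targets[row_start[v]] .. targets[row_start[v + 1] - 1].
struct CsrGraph {
    std::vector<std::size_t> row_start;  // size: num_vertices + 1
    std::vector<std::size_t> targets;    // size: num_edges
};

// Builds the CSR arrays from an edge list of (source, target) pairs.
// Assumes every endpoint is < num_vertices.
CsrGraph build_csr(std::size_t num_vertices,
                   const std::vector<std::pair<std::size_t, std::size_t>>& edges) {
    CsrGraph g;
    g.row_start.assign(num_vertices + 1, 0);
    g.targets.resize(edges.size());

    // Pass 1: count the out-degree of each source, stored one slot to the right.
    for (const auto& e : edges) ++g.row_start[e.first + 1];

    // A prefix sum turns the counts into row start offsets.
    for (std::size_t v = 0; v < num_vertices; ++v) {
        g.row_start[v + 1] += g.row_start[v];
    }

    // Pass 2: scatter each target into the next free slot of its source's row.
    std::vector<std::size_t> next(g.row_start.begin(), g.row_start.end() - 1);
    for (const auto& e : edges) g.targets[next[e.first]++] = e.second;
    return g;
}
```

The trade-off is that adding an edge afterwards means shifting or rebuilding the arrays, which is why this layout fits the "structure known up front, not mutated later" case raised in comment 2.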

comment:4 by Antony Polukhin, 11 years ago

Those functions are not really required, because m_vertices is public. So you can just call graph.m_vertices.resize(35000);

It gives a really good performance boost (30% in one of my projects), but you need to know the approximate vertex count in advance.
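Note that, on a std::vector, resize() and reserve() are not interchangeable: resize(n) changes the number of elements, while reserve(n) only sets aside capacity. The sketch below shows this on a plain std::vector; the function names are made up for illustration, and whether calling either one directly on m_vertices leaves an adjacency_list consistent is not established in this ticket.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: element count of an empty vector after reserve(n).
std::size_t size_after_reserve(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);  // capacity() >= n, but no elements are created
    return v.size();
}

// Hypothetical helper: element count of an empty vector after resize(n).
std::size_t size_after_resize(std::size_t n) {
    std::vector<int> v;
    v.resize(n);  // creates n value-initialized elements
    return v.size();
}
```

Applied to a vertex container, resize(35000) would therefore correspond to creating 35,000 vertices, whereas the methods proposed in the ticket description use reserve().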

comment:5 by blkohn@…, 9 years ago (in reply to: comment 4)

Replying to apolukhin:

Those functions are not really required, because m_vertices is public. So you can just call graph.m_vertices.resize(35000);

That seems a bit lazy. m_vertices should not be part of the public interface, and clearly we would like to be able to reserve sizes on these at any time. Why not just add this functionality to the interface? Simply exposing m_vertices allows me to do all manner of things which may leave the graph in an inconsistent state.
