id summary reporter owner description type status milestone component version severity resolution keywords cc 1059 random_on_sphere can be made 10 times faster! cschrett@… "Computing Random Points on Sphere Colas Schretter cschrett@ulb.ac.be Monte-Carlo simulations of the photons transport require to pick, very quickly, random points on the surface of a unit sphere. I tried four ways to randomly generate unit direction vectors in three dimensions. Naive method A point is picked randomly in the unit cube with a uniform random generator. If this point lies inside the unit sphere, it is projected on the surface of the sphere by normalizing its coordinates. If the point do not lies inside the unit sphere, then we iterate this procedure. Since the volume of the unit sphere is \pi/6, the average number of iterations is equal to 1.909. Trig method Any point on the surface of a unit sphere can be represented by two coordinates. The trig method pick randomly the z\in\left[-1,+1\right[ coordinate to select a slice of the sphere then a point is uniformly selected on this circle by randomly picking an azimuthal angle \phi\in\left[0,2\pi\right[. Note that the random generator is only called twice, and no trial-error loop is needed. Hybrid naive and trig method A point is picked randomly in the unit disk with a trial-error loop. Since the area of the unit disk is \pi/4, the average number of iterations is equal to 1.2732. Hence, the probability of rejection is lower than in three dimensions. The third coordinate is deduced by using the fact that the coordinates of a point on the unit sphere ensure x+y+z=1. Finally the x and y coordinates are scaled according to the value of z. Using the uniform_on_sphere distribution of Boost The random_on_sphere distribution of the Boost library uses Normal distributions to fill an STL container with coordinates. The procedure work for any dimensions and it is always more elegant to use trusted third party libraries instead of coding yourself. Benchmarks Ten million of three dimensional random vectors have been generated on various machines and compilers, using the four methods mentioned above. The GNU gcc 4.0.1 compiler was used on unices with and without optimization flags (-O6). The Visual C++ 2005 compiler was used on a Windows platform with disabled (/Od) and full (/Ox) optimizations. Results are shown in Table [cap:Benchmarks] for the three machines that has been tested: * Apple Powerbook 12, 1.33 GHz (IBM G4 1.33 GHz) * Dell latitude D620 laptop with Intel T2300, 1.66 GHz (Intel T2300 1.66 GHz) * HP XC Cluster Platform 4000 with AMD Opteron, 2.4 GHz (AMD Opteron 2.4 GHz) +-------------------------+----------+----------+-----------+---------+ | CPU | Naive | Trig | Hybrid | Boost | +-------------------------+----------+----------+-----------+---------+ +---------------------------------------------------------------------+ | Without optimization | +-------------------------+----------+----------+-----------+---------+ | IBM G4 1.33 GHz | 18.43 | 11.35 | 9.09 | 80.88 | +-------------------------+----------+----------+-----------+---------+ | Intel T2300 1.66 GHz | 4.58 | 3.87 | 2.81 | 23.60 | +-------------------------+----------+----------+-----------+---------+ | AMD Opteron 2.4 GHz | 3.91 | 2.94 | 2.00 | 15.70 | +-------------------------+----------+----------+-----------+---------+ +---------------------------------------------------------------------+ | With optimizations | +---------------------------------------------------------------------+ | IBM G4 1.33 GHz | 4.33 | 5.17 | 2.8 | 17.30 | +-------------------------+----------+----------+-----------+---------+ | Intel T2300 1.66 GHz | 2.06 | 2.39 | 1.14 | 9.36 | +-------------------------+----------+----------+-----------+---------+ | AMD Opteron 2.4 GHz | 1.06 | 1.46 | 0.57 | 5.54 | +-------------------------+----------+----------+-----------+---------+ Benchmarks results in seconds with and without compiler optimizations. Conclusions When I do not switch on optimization flags of my compiler, then the timings confirm the results of Ken Sloan, referenced in the FAQ of comp.graphics.algorithms. However, with optimization, the naive method is 20% faster than the trig method. In any cases, the hybrid naive and trig method is the winner. Similar tests have been conducted for the two-dimensional case. Using polar coordinates, picking a point on a disc only require one random toss and two call to sin/cos trigonometric functions. I observed that the performance of the naive rejection method also outperforms the trigonometric calculations in two dimensions. It was interesting to experimentally determine the threshold where the naive method begins to perform worse than the current implementation of Boost because of the curse of dimensionality. In four and five dimensions, the naive method significantly outperforms the current implementation based on normal distributions. However, for six and more dimensions, the Boost implementation is significantly faster. In fact in five dimensions, the naive method is one third faster than Boost but is one third slower in six dimensions. The benchmarks should be run on more computer architectures and compilers to confirm my conclusions. For example, the Intel C++ compiler is often used nowadays... Proposal to update the Boost implementation I think that it is mandatory to improve the performances of the Boost implementation. I suggest to convert the dimension parameter to a template argument, the declaration syntax becomes uniform_on_sphere<3> sphere; instead of the current syntax: uniform_on_sphere<> sphere(3); Then, this is possible to specialize template implementations for the two- and three-dimensional cases that are so common. The 4 and 5 dimensional cases can also be implemented with the naive method. The new syntax is also more semantic and elegant. The current interface requires that the container have to provide STL-like iterators. I suggest to implement the algorithms by using [ ] accessors instead. This will allow to assign valarrays or custom-made vector classes that are popular in implementations of linear algebra libraries. Source Code // Benchmark of several methods to pick randomly points on the unit sphere // Colas Schretter 2006 #include #include using namespace std; #include #include using namespace boost; #define PI 3.1415926535897932385 typedef lagged_fibonacci607 generator_type; const int nbr_samples = 10000000; struct Vector { float x,y,z; Vector() {} Vector(const float x, const float y, const float z): x(x), y(y), z(z) {} float length2() const { return x*x + y*y + z*z; } Vector& normalize() { const float inverse = 1.0 / sqrt(length2()); x *= inverse, y *= inverse, z *= inverse; return *this; } }; inline Vector naive(uniform_01& random) { Vector direction; do { direction.x = 2 * random() - 1, direction.y = 2 * random() - 1, direction.z = 2 * random() - 1; } while(direction.length2() > 1); return direction.normalize(); } inline Vector trig(uniform_01& random) { const float phi = 2 * PI * random(); const float cos_theta = 2 * random() - 1; const float sin_theta = sqrt(1 - cos_theta * cos_theta); return Vector(cos_theta, sin_theta * cos(phi), sin_theta * sin(phi)); } inline Vector hybrid_naive_and_trig(uniform_01& random) { Vector direction; float s; do { direction.x = 2 * random() - 1, direction.y = 2 * random() - 1, s = direction.x * direction.x + direction.y * direction.y; } while(s > 1); const float r = 2 * sqrt(1 - s); direction.x *= r; direction.y *= r; direction.z = 2 * s - 1; return direction; } int main(int argc, char** argv) { cout << ""Random on Sphere Benchmark by Colas Schretter 2006"" << endl; generator_type generator(42); uniform_01 random(generator); uniform_on_sphere<> sphere(3); variate_generator > random_on_sphere(generator, sphere); float s = 0; // dummy result timer t; t.restart(); for(int i = 0; i < nbr_samples; ++i) { const Vector direction = naive(random); // do something with the computed values s += direction.x + direction.y + direction.z; } cout << ""naive method: "" << t.elapsed() << "" sec."" << endl; t.restart(); for(int i = 0; i < nbr_samples; ++i) { const Vector direction = trig(random); // do something with the computed values s += direction.x + direction.y + direction.z; } cout << ""trig method: "" << t.elapsed() << "" sec."" << endl; t.restart(); for(int i = 0; i < nbr_samples; ++i) { const Vector direction = hybrid_naive_and_trig(random); // do something with the computed values s += direction.x + direction.y + direction.z; } cout << ""hybrid naive and trig method: "" << t.elapsed() << "" sec."" << endl; t.restart(); for(int i = 0; i < nbr_samples; ++i) { const vector random_direction = random_on_sphere(); const Vector direction(random_direction[0], random_direction[1], random_direction[2]); // do something with the computed values s += direction.x + direction.y + direction.z; } cout << ""boost random_on_sphere: "" << t.elapsed() << "" sec."" << endl; // use the dummy result to be sure that the compiler optimizations do not skip computations cout << ""dummy result = "" << s << endl; return EXIT_SUCCESS; }" Feature Requests closed To Be Determined random Optimization fixed Jens.Maurer@…