Opened 11 years ago
Last modified 9 years ago
#6706 new Tasks
Mersenne Twister discard() speed improvement — at Version 2
| Reported by: | Owned by: | No-Maintainer | |
|---|---|---|---|
| Milestone: | To Be Determined | Component: | random | 
| Version: | Boost 1.48.0 | Severity: | Optimization | 
| Keywords: | Cc: | 
Description (last modified by )
There is an algorithm for fast jump ahead for the MT:
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/JUMP/index.html
Some more references (thanks Topher!)
Hiroshi Haramoto, Makoto Matsumoto, and Pierre L'Ecuyer. 2008. A Fast Jump Ahead Algorithm for Linear Recurrences in a Polynomial Space. In /Proceedings of the 5th international conference on Sequences and Their Applications/ (SETA '08), Solomon W. Golomb, Matthew G. Parker, Alexander Pott, and Arne Winterhof (Eds.). Springer-Verlag, Berlin, Heidelberg, 290-298. DOI=10.1007/978-3-540-85912-3_26 http://dx.doi.org/10.1007/978-3-540-85912-3_26
An accessible online copy is at:
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/jump-seta-lfsr.pdf
If I understand it correctly, whether or not it is "constant time" is a bit ambiguous. Applying it to an arbitrary MT generator requires O(k1.6) where k is the size of the state, and k must be moderately large before this asymptote is approached. For a particular generator the k is a constant so it is "constant time" -- i.e., constant in the size of the jumps. So, for example, k for MT19937 is 19937 and so is k1.6 (about 7.5 million). The timing tests in the paper seem to indicate reasonable performance for common applications (e.g., where generating the jumps is done infrequently relative to the number of calls to the generators themselves).
Change History (2)
comment:1 by , 9 years ago
comment:2 by , 9 years ago
| Description: | modified (diff) | 
|---|
Fixed in http://github.com/boostorg/random/commit/39d05f01614af0e8d46f97c3395b912fccf1fc79. The algorithm is still slow (20-50 ms/jump on my machine), but it starts to become faster than the naive algorithm for a jump size of about 10,000,000. I'm leaving this ticket open for now, because I believe that a similar algorithm can be used for several other generators. In particular, the lagged_fibonacci generator has a structure which should be suitable for this.


I'm working on implementing this, but the performance isn't really acceptable. The timings in the paper aren't really relevant because the execution time is dominated by the calculation of
g(t) = tJmod phi(t), which they precomputed as it only depends onJ, not on the state of the generator. The interface of discard doesn't really allowg(t)to be precomputed, so we have to take the cost every time.