wiki:SoC2013

Version 14 (modified by christopher_kormanyos, 10 years ago) ( diff )

Added Boost.Multiprecision radix-2 float back-end

Google Summer of Code 2013

Welcome to the Boost C++ Libraries' home page for Google Summer of Code (GSoC). This page provides information about student projects, proposal submission templates, advice on writing good proposals, and links to information on getting started writing with Boost.

This year Boost is looking to fund work on a number of different kinds of proposals:

  • toolkit-like extensions to existing libraries,
  • finishing or extending sandbox libraries,
  • new data structures and algorithms, and
  • multiple competing proposals for the same project.

For projects involving new or experimental libraries, the process of getting source code "Boost-branded" can take much longer than a single summer. In many cases, it can take much longer than a single year. Even if a library is accepted, there is an expectation that the original author will continue to maintain it. Building a library as part of Boost can easily entail a multi-year commitment. For this reason, we are willing to consider multi-year GSoC projects. However, prospective students must limit the scope of their work to a single summer. We may invite the most successful students to re-apply in 2014.

Requirements

Students must submit a proposal. A template for the proposal can be found here here. Hints for writing a good proposal can be found here.

We strongly suggest that students interested in developing a proposal for Boost discuss their ideas on the mailing list in order to help refine the requirements and goals. Students who actively discuss projects on the mailing list are also ranked before those that do not.

Projects

The following projects have been suggested by potential mentors. If the descriptions of these projects seem a little vague... Well, that's intentional. We are looking for students to develop requirements for their proposals by doing initial background research on the topic, and interacting with the community on the mailing list to help identify expectations.

Projects from previous years can be found here. There are still a number of interesting projects found in these pages.

Boost.uBLAS

First of all, we have a page with the list of future and desired new features here: http://sourceforge.net/apps/mediawiki/ublas/index.php?title=Main_Page. Students are encourage to consult this page.

Boost.uBLAS http://www.boost.org/doc/libs/1_53_0/libs/numeric/ublas/doc/index.htm is a fast implementation of linear algebra procedures, of for short it's a vector and matrix library. Actually it's indeed a vector AND matrix library and this is one of the main problem. Vectors are matrices, at least in standard math textbook, but not in Boost.uBLAS. They are represented as 2 separate classes and do not share code really. Not enough to be efficient. Moreover, vector being considered as fixed-sized vectors, they are not as versatile as STL vectors (but it's not the same concept) and not as accurate as a true linear algebra vector, that is they do not implement the notion of being row-vector or column-vector.

Said like that, it's not that important you would say, but by merging vector and matrix classes, we could share a lot of code and optimize even further. Second of all, by having a unified architecture we could start implementing modern acceleration techniques that lacks in Boost.uBLAS, like SIMD computations, multi-core, GPU, etc...

Our ideas for a GSOC project are the following:

  • unify representation of vector and matrices into a unique matrix class. Keep compatibility with previous code by providing a default vector<> representation that would inherit for matrix<>. Improve it too by adding more template parameters like size and orientation.
  • add a framework for automatic algorithm selection at compile-time (that's more academic),
  • add a framework for automatic algorithm selection at run-time (that's very academic, I'm thinking of matrix inversion etc...),
  • use this new architecture to propose implementation for the following:
    • fixed-sized vectors and matrices with optimization
    • a true * operator for vector/vector, vector/matrix and matrix/matrix multiplication
    • an architecture to choose at compile time the best algorithm to apply if several are available (very relevant to multiplication for example),
    • ideas and examples on how to implement SIMD and multicore operations,
    • use this new framework to implement particular matrices like Toeplitz, Hadamard, block matrix (see a taxonomy and possible applications here: http://en.wikipedia.org/wiki/List_of_matrices) and select algorithms that are more specific to each type in order to optimize code even further.

The last feature is highly desirable in Machine Learning, Big Data, Computer Vision, Data mining, etc...

Inspiration from other libraries like Eigen, Armadillo, GotoBLAS, etc... is highly recommended (after all, that's one of the raison d'etre of Free Software).

Mentor: David Bellot (david.bellot[at]gmail.com)

Boost.Math Bernoulli Numbers and Applications to Special Functions

Boost.Math http://www.boost.org/doc/libs/1_53_0/libs/math/doc/sf_and_dist/html/ is a large well-established Boost library, but new mathematical functions can always be added.

We wonder if students would be interested in adding Bernoulli numbers needed in several useful series.

Bernoulli numbers are rational numbers. Bernoulli numbers frequently arise in pure and applied mathematics in areas such as asymptotic approximations, Euler-Maclaurin summation and many others.

In this project, we will implement fast, accurate calculations of Bernoulli numbers for built-in types as well as multiprecision types. The Bernoulli numbers will subsequently be used for improving the existing implementation of gamma functions and adding brand new Boost.Math support for the polygamma function.

http://en.wikipedia.org/wiki/Bernoulli_number

http://mathworld.wolfram.com/BernoulliNumber.html

  • Add support for Bernoulli numbers along with thread-safe caching.
  • Add support for polygamma of positive integer order (requires Bernoulli numbers).
  • Improve tgamma/lgamma for multiprecision types based on Stirling's approx.
  • Optional: Add support for the Hurwitz zeta function.

This would clearly require some serious math skills, but also good knowledge of C++, especially using templates which Boost.Math makes much use of to provide not only built-in double and long double but recently multiprecision, large fixed and arbitrary precision. You will need to have experience of using Boost libraries, including Boost.Test, and using SVN and/or GIT.

If studying http://svn.boost.org/svn/boost/trunk/boost/math/special_functions/gamma.hpp leaves you frightened, then this project is not for you.

If you would like to demonstrate your skills, you might like to try coding the naive Akiyama–Tanigawa algorithm for second Bernoulli numbers Bn http://en.wikipedia.org/wiki/Bernoulli_number#Algorithmic_description using Boost.Math. Extra marks for providing a Boost.Test comparing with a handful of published values. You can use any platform, Linux, Mac or Microsoft with your IDE of choice, perhaps Visual Studio or NetBeans.

The project will by mentored by Paul Bristow for administration and Boost infrastructure, and John Maddock and Christopher Kormanyos for mathematical and algorithmic expertise.

Boost.odeint

Boost.odeint http://www.odeint.com is a library devoted to finding numerical solutions of ordinary differential equations (ODEs). It is developed in a generic way using Template Metaprogramming which leads to extraordinary high flexibility at top performance. It is widely used in academic and industrial projects. Boost.odeint has been a Google Summer of Code project in 2011 http://google-melange.appspot.com/gsoc/project/google/gsoc2011/mulansky/14001 and we would like again to see students involved in the development of this numerical library. Therefore, we offer the two following projects:

Implicit Routines

The current focus of odeint is on explicit routines, namely the Runge-Kutta schemes or the multi-step methods. However, we would like to expand odeint by adding implicit routines in the same flexible manner. Implicit routines are important when dealing with stiff systems, but also ensure stability in some cases of discretized partial differential equations (PDEs). At the moment, odeint provides an implicit Euler and Rosenbrock algorithm, but the implementation is restricted to Boost.uBlas. As one of the main objectives of odeint is to provide highly flexible algorithms, we want to change this implementation to have a greater flexibility and interchangeability similar to that of the explicit routines. This project would not only require pure coding, but first also a considerable amount of design work.

  • Develop a design on how the requirements on implicit routines can be modularized
  • Implement a prototype showing the usability of the design
  • Change the existing routines to the new design
  • Implement other backends (MTL, eigen,...) to demonstrate the flexibility
  • Provide examples and documentation
  • (Add more implicit routines, if time permits)

This project does not only require profound knowledge on C++ and generic programming, but we would also like to see some experience on numerical algorithms and solving ODEs, if possible.

The project will be mentored by Karsten Ahnert and Mario Mulansky.

Parallelized Backends

The modularized design of odeint allows it to be used in a wide variety of environments by implementing appropriate computational backends. For example, there already exist such backends for running odeint on graphic cards, both for OpenCL and CUDA. However, in this project we would like to see the applicability further expanded in terms of parallelized computation backends based on OpenMP, MPI and maybe also SIMD. Some prototypical OpenMP backends have already been developed and used, but have not yet reached a satisfactory level to be integrated with the rest of odeint. This project would mainly consist of implementing computational backends for odeint, benchmarking them in performance tests and providing examples on how to use these backends efficiently, maybe in terms of a short tutorial. For this work, despite profound knowledge on C++, it would also be helpful to have some experience with numerical programming and parallelization.

  • Develop parallelized computational backends (OpenMP, MPI, SIMD)
  • Implement performance tests
  • Add examples and tutorial

This project will be mentored by Karsten Ahnert and Mario Mulansky

Boost.Multiprecision

Boost.Multiprecision http://www.boost.org/doc/libs/1_53_0/libs/multiprecision/doc/html/boost_multiprecision/intro.html/ is a new Boost library that offers multiple precision integer, rational and floating-point types with precision exceeding those of built-in float, double and long double. Boost.Multiprecision uses a uniform architecture that embodies the extended precision type with a front-end generic number template combined with a one of several back-end number types that grind out the nuts-and-bolts of the multiprecision work.

We are looking for a student to assist with writing a high-performance radix-2 floating-point back-end for Boost.Multiprecision. The current implementation uses radix-10 and suffers certain performance losses and lack of extensibility therefrom.

This is exciting and challenging task requires a high level of adeptness with C++ coding, mathematics, and algorithms. It combines high-performance with absolute error intolerance. As such, this project will hone the skills of the mathematical and algorithmic programmer and serve well as a research topic for students whose studies include algorithms.

Heavy use will be made of

  • C++
  • Templates
  • The STL
  • Memory management
  • Schoolbook and Karatsuba multiplication
  • Fast Fourier transforms
  • Conversion to and from radix-2 and radix-10

The reference "Modern Computer Arithmetic" is a valuable source for the algorithms in this project. A draft version of the book is available free of charge and the printed copy can be found in a library or at a book seller.

This project will be mentored by Christopher Kormanyos

Ideas

Here you find a lot of ideas which have been mentioned on the Boost mailing list. Boost developers can use these ideas to create new projects (and add them to the list of projects above). And students can use them as start points to propose projects themselves. While these ideas are by far not as concrete as projects in the list above, they give students an idea what else they can work on within Boost. They also give students a chance to propose exactly the kind of project they'd like to work on instead of picking one from the predefined projects from above.

Boost libraries (extending/overhauling/finishing)

Algorithms

  • Radix sort
  • Approximate string matching
  • Full text search
  • Near Duplicate Detection (shingling)
  • Parallel algorithms (sort, for_each)
  • Algorithms for gpgpu
  • Kinetic scrolling

Data strutures

  • Trie data structure
  • B-tree data structure
  • Concurrent containers (unordered_map, unordered_set, vector, forward_list)
  • Slim string (as opposed to the fat std::string interface, maybe immutable, maybe policy based)

Media

  • Hardware graphics (OpenGL/OpenGL ES/DirectX abstraction)
  • Audio library (OpenAL/FMOD/etc)
  • Video processing library (gil for video)

Games

  • Physics library (ODE/Havok/PhysX/etc abstraction)
  • Input library (like DirectInput/XInput)
  • Ranking algorithms (elo/TrueSkill)

Databases

Math

  • Fixed-point math library (binary scaling)
  • Geometry library (convince the developers to submit Eigen 3 to Boost)

Memory

  • Memcache library
  • Memswap algorithm

File formats

  • JSON parsing libary
  • XML library

Communication

GUI

  • GUI library

C++ "extensions"

Performance

  • Benchmark framework

Hardware

  • CPUID
Note: See TracWiki for help on using the wiki.