wiki:SoC2013

Google Summer of Code 2013

Welcome to the Boost C++ Libraries' home page for Google Summer of Code (GSoC). This page provides information about student projects, proposal submission templates, advice on writing good proposals, and links to information on getting started writing with Boost.

This year Boost is looking to fund work on a number of different kinds of proposals:

  • toolkit-like extensions to existing libraries,
  • finishing or extending sandbox libraries,
  • new data structures and algorithms, and
  • multiple competing proposals for the same project.

For projects involving new or experimental libraries, the process of getting source code "Boost-branded" can take much longer than a single summer. In many cases, it can take much longer than a single year. Even if a library is accepted, there is an expectation that the original author will continue to maintain it. Building a library as part of Boost can easily entail a multi-year commitment. For this reason, we are willing to consider multi-year GSoC projects. However, prospective students must limit the scope of their work to a single summer. We may invite the most successful students to re-apply in 2014.

Requirements

Students must submit a proposal. A template for the proposal can be found here here. Hints for writing a good proposal can be found here.

We strongly suggest that students interested in developing a proposal for Boost discuss their ideas on the mailing list in order to help refine the requirements and goals. Students who actively discuss projects on the mailing list are also ranked before those that do not.

Projects

The following projects have been suggested by potential mentors. If the descriptions of these projects seem a little vague... Well, that's intentional. We are looking for students to develop requirements for their proposals by doing initial background research on the topic, and interacting with the community on the mailing list to help identify expectations.

Projects from previous years can be found here. There are still a number of interesting projects found in these pages.

Boost.uBLAS

First of all, we have a page with the list of future and desired new features here: http://sourceforge.net/apps/mediawiki/ublas/index.php?title=Main_Page. Students are encourage to consult this page.

Boost.uBLAS http://www.boost.org/doc/libs/1_53_0/libs/numeric/ublas/doc/index.htm is a fast implementation of linear algebra procedures, of for short it's a vector and matrix library. Actually it's indeed a vector AND matrix library and this is one of the main problem. Vectors are matrices, at least in standard math textbook, but not in Boost.uBLAS. They are represented as 2 separate classes and do not share code really. Not enough to be efficient. Moreover, vector being considered as fixed-sized vectors, they are not as versatile as STL vectors (but it's not the same concept) and not as accurate as a true linear algebra vector, that is they do not implement the notion of being row-vector or column-vector.

Said like that, it's not that important you would say, but by merging vector and matrix classes, we could share a lot of code and optimize even further. Second of all, by having a unified architecture we could start implementing modern acceleration techniques that lacks in Boost.uBLAS, like SIMD computations, multi-core, GPU, etc...

Our ideas for a GSOC project are the following:

  • unify representation of vector and matrices into a unique matrix class. Keep compatibility with previous code by providing a default vector<> representation that would inherit for matrix<>. Improve it too by adding more template parameters like size and orientation.
  • add a framework for automatic algorithm selection at compile-time (that's more academic),
  • add a framework for automatic algorithm selection at run-time (that's very academic, I'm thinking of matrix inversion etc...),
  • use this new architecture to propose implementation for the following:
    • fixed-sized vectors and matrices with optimization
    • a true * operator for vector/vector, vector/matrix and matrix/matrix multiplication
    • an architecture to choose at compile time the best algorithm to apply if several are available (very relevant to multiplication for example),
    • ideas and examples on how to implement SIMD and multicore operations,
    • use this new framework to implement particular matrices like Toeplitz, Hadamard, block matrix (see a taxonomy and possible applications here: http://en.wikipedia.org/wiki/List_of_matrices) and select algorithms that are more specific to each type in order to optimize code even further.

The last feature is highly desirable in Machine Learning, Big Data, Computer Vision, Data mining, etc...

Inspiration from other libraries like Eigen, Armadillo, GotoBLAS, etc... is highly recommended (after all, that's one of the raison d'etre of Free Software).

Mentor: David Bellot (david.bellot[at]gmail.com)

Boost.Math Bernoulli Numbers and Applications to Special Functions

Boost.Math http://www.boost.org/doc/libs/1_53_0/libs/math/doc/sf_and_dist/html/ is a large well-established Boost library, but new mathematical functions can always be added.

We wonder if students would be interested in adding Bernoulli numbers needed in several useful series.

Bernoulli numbers are rational numbers. Bernoulli numbers frequently arise in pure and applied mathematics in areas such as asymptotic approximations, Euler-Maclaurin summation and many others.

In this project, we will implement fast, accurate calculations of Bernoulli numbers for built-in types as well as multiprecision types. The Bernoulli numbers will subsequently be used for improving the existing implementation of gamma functions and adding brand new Boost.Math support for the polygamma function.

http://en.wikipedia.org/wiki/Bernoulli_number

http://mathworld.wolfram.com/BernoulliNumber.html

  • Add support for Bernoulli numbers along with thread-safe caching.
  • Add support for polygamma of positive integer order (requires Bernoulli numbers).
  • Improve tgamma/lgamma for multiprecision types based on Stirling's approx.
  • Optional: Add support for the Hurwitz zeta function.

This would clearly require some serious math skills, but also good knowledge of C++, especially using templates which Boost.Math makes much use of to provide not only built-in double and long double but recently multiprecision, large fixed and arbitrary precision. You will need to have experience of using Boost libraries, including Boost.Test, and using SVN and/or GIT.

If studying http://svn.boost.org/svn/boost/trunk/boost/math/special_functions/gamma.hpp leaves you frightened, then this project is not for you.

If you would like to demonstrate your skills, you might like to try coding the naive Akiyama–Tanigawa algorithm for second Bernoulli numbers Bn http://en.wikipedia.org/wiki/Bernoulli_number#Algorithmic_description using Boost.Math. Extra marks for providing a Boost.Test comparing with a handful of published values. You can use any platform, Linux, Mac or Microsoft with your IDE of choice, perhaps Visual Studio or NetBeans.

The project will by mentored by Paul Bristow for administration and Boost infrastructure, and John Maddock and Christopher Kormanyos for mathematical and algorithmic expertise.

Boost.odeint

Boost.odeint http://www.odeint.com is a library devoted to finding numerical solutions of ordinary differential equations (ODEs). It is developed in a generic way using Template Metaprogramming which leads to extraordinary high flexibility at top performance. It is widely used in academic and industrial projects. Boost.odeint has been a Google Summer of Code project in 2011 http://google-melange.appspot.com/gsoc/project/google/gsoc2011/mulansky/14001 and we would like again to see students involved in the development of this numerical library. Therefore, we offer the two following projects:

Implicit Routines

The current focus of odeint is on explicit routines, namely the Runge-Kutta schemes or the multi-step methods. However, we would like to expand odeint by adding implicit routines in the same flexible manner. Implicit routines are important when dealing with stiff systems, but also ensure stability in some cases of discretized partial differential equations (PDEs). At the moment, odeint provides an implicit Euler and Rosenbrock algorithm, but the implementation is restricted to Boost.uBlas. As one of the main objectives of odeint is to provide highly flexible algorithms, we want to change this implementation to have a greater flexibility and interchangeability similar to that of the explicit routines. This project would not only require pure coding, but first also a considerable amount of design work.

  • Develop a design on how the requirements on implicit routines can be modularized
  • Implement a prototype showing the usability of the design
  • Change the existing routines to the new design
  • Implement other backends (MTL, eigen,...) to demonstrate the flexibility
  • Provide examples and documentation
  • (Add more implicit routines, if time permits)

This project does not only require profound knowledge on C++ and generic programming, but we would also like to see some experience on numerical algorithms and solving ODEs, if possible.

The project will be mentored by Karsten Ahnert and Mario Mulansky.

Parallelized Backends

The modularized design of odeint allows it to be used in a wide variety of environments by implementing appropriate computational backends. For example, there already exist such backends for running odeint on graphic cards, both for OpenCL and CUDA. However, in this project we would like to see the applicability further expanded in terms of parallelized computation backends based on OpenMP, MPI and maybe also SIMD. Some prototypical OpenMP backends have already been developed and used, but have not yet reached a satisfactory level to be integrated with the rest of odeint. This project would mainly consist of implementing computational backends for odeint, benchmarking them in performance tests and providing examples on how to use these backends efficiently, maybe in terms of a short tutorial. For this work, despite profound knowledge on C++, it would also be helpful to have some experience with numerical programming and parallelization.

  • Develop parallelized computational backends (OpenMP, MPI, SIMD)
  • Implement performance tests
  • Add examples and tutorial

This project will be mentored by Karsten Ahnert and Mario Mulansky

Boost.Multiprecision

Boost.Multiprecision (http://www.boost.org/doc/libs/1_53_0/libs/multiprecision/doc/html/index.html) is a new Boost library that offers multiple precision integer, rational and floating-point types with precision exceeding those of built-in float, double and long double. Boost.Multiprecision uses a uniform architecture that embodies the extended precision type with a front-end generic number template combined with one of several back-end number types that grind out the nuts-and-bolts of the multiprecision work.

We are looking for a student to assist with writing a high-performance radix-2 floating-point back-end for Boost.Multiprecision. The current implementation uses radix-10 and suffers certain performance losses and lack of extensibility therefrom.

This is definitely an advanced project in C++ and algorithmic. Yet at the same time, it is an exciting and challenging task. It will require a high level of adeptness with C++ coding, mathematics, and algorithms. It combines high-performance with absolute error intolerance. As such, this project will hone the skills of the mathematical and algorithmic programmer and serve well as a research topic for students whose studies include algorithms.

Boost.Multiprecision already has a boost-licensed decimal (radix-10) floating-point back-end. This project will provide a binary (radix-2) floating point back-end that is expected to improve efficiency and provide a more natural conversion to and from C++ built-in floating-point types having radix-2 such as float, double, and long double.

We will now discuss various project details and sketch the general progression of the steps that will be undertaken.

The main steps of algorithmic design include

  • First of all, we will closely monitor the progress of this project and decide what precision ranges we would like to support.
  • The easiest precision level involves floating-point numbers with less than about 1,000 decimal digits of precision.
  • Further precision levels climb to thousands or even millions of decimal digits. These require algorithms of increasing complexity.
  • One of the first steps is to provide I/O handling for character-based strings and C++ <iostream> and <iomanip> support.
  • The next important step involves implementing the basic algebraic operations (+, -, *, /), in other words addition, subtraction, multiplication, and division, using a variety of techniques.
  • Multiplication is the performance bottleneck in multiprecision and we will be spending a lot of time here. Depending on the digit-range goals, we may optionally include high-performance with Karatsuba multiplication and potentially ultra-high performance with FFT-based multiplication.
  • Optionally, we will be providing support for the elementary transcendental functions listed in <cmath>, as is now done for the existing back-ends.
  • Finally, we need to ensure that the back-end seamlessly interoperates with Boost.Math.
  • Testing is essential in order to verify the correctness of the implementation. We will, therefore, be writing a variety of tests to validate the algebra as well as the functions.

The existing radix-10 implementation in <boost/multiprecision/cpp_dec_float.hpp> can serve as a guide for candidates interested in visualizing the level of complexity involved in this project. In addition, a sketch consisting of mainly a feasibility check for radix-2 can be found in the sandbox.

Heavy use will be made of

  • C++
  • Templates
  • The STL
  • Memory management
  • Schoolbook and Karatsuba multiplication
  • Fast Fourier transforms
  • Conversion to and from radix-2 and radix-10

The new reference book "Modern Computer Arithmetic" is a valuable source for the algorithms in this project. A draft version of the book is available free of charge and the printed copy can be found in a library or at a book seller.

This project will be mentored by Christopher Kormanyos

Boost.ASIO

Boost.ASIO is probably the leading asynchronous i/o implementation library in C++, followed in popularity only by libuv which is pure C. Parts of Boost.ASIO are expected to join the C++ standard library in TR2 as the official C++ networking implementation (see N3360 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3360.pdf). Yet something very obvious is missing: Boost.ASIO is missing asynchronous file i/o support for POSIX, yet there is a partial implementation for Microsoft Windows (http://www.boost.org/doc/libs/1_53_0/doc/html/boost_asio/overview/windows/random_access_handle.html).

A very nice and useful Boost GSoC project would be to fully complete asynchronous file i/o support for Boost.ASIO and to submit it as a complete new Boost library (Boost.AFSIO?) which ties together Boost.ASIO, Boost.Thread, Boost.Filesystem etc. And in fact I have implemented most of such a library (see https://github.com/ned14/TripleGit/blob/master/triplegit/include/async_file_io.hpp), but the following items remain outstanding before it could be accepted by Boost:

  1. POSIX async i/o support is limited to a threadpooled emulation. Proper support for Linux's kaio and POSIX's aio_* interfaces would be great. Note that these APIs are extremely tricky to get working right across all POSIX platforms, especially OS X.
  1. The file i/o APIs supplied are the bare minimum. A deeper, more explicit integration with Boost.Filesystem which is currently only supported using hand written template shims would be needed.
  1. Documentation is rudimentary, even by Boost standards.
  1. Unit testing is not to Boost standards.
  1. The existing implementation is exclusively C++11 and does not use Boost's macro trickery for supporting older compilers. Some refactoring would be needed here to pass Boost peer review.
  1. The library needs to pass Boost peer review, with all the bureaucracy that involves. Do NOT underestimate the difficulty of this part - in fact, this part alone is probably 50% or more of a GSoC project. Yet it is probably the most valuable part of a Boost GSoC internship because if successful, your resume can state that you're the (co)author of a Boost library. That's something only a few hundred people in the world can claim, and it will stand with you as a bright point for the rest of your career.

This project would be (co)mentored by Niall Douglas (http://www.nedprod.com/).

Benchmark Library

Benchmarking code is a very important task, especially for library authors that aim at high performance usage. So far, there is no benchmarking framework in boost. This project intends to start such a library with the (long time) perspective to be included into boost. There already is quite some demand for benchmarking in existing libraries like Boost.Math or Boost.odeint. This work would involve major library design steps, hence some experience with library development of the student is recommended, but no necessary requirement.

During the GSoC the student is expected to:

  • Analyze benchmark use cases in boost libraries
  • Create a simple framework to benchmark functions
  • Based on Boost.Timer
  • Investigate benchmark automatization with Boost.Jam
  • Develop exemplary benchmarks for some Boost libraries

This project will be mentored by Mario Mulansky

Boost.Functional/Invoke

Provide a boost::invoke function implementation of the c++11 definition of INVOKE for c++11 and C++03 compilers.

This project could be mentored by Vicente J. Botet Escriba

Boost.Thread/SyncQueue

Provide boost::sync_bounded/unbounded_queue implementations base on the C++ Concurrent Queues proposal [1].

An adaptation of the Boost.LockFree queues to follow the same interface would be welcome.

[1] http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2013/n3533.html#Managed

This project could be mentored by Vicente J. Botet Escriba

Boost.Thread/ThreadPool

Provide a boost::thread_pool class scheduling arbitrary functions based on [2] and extend the boost::async function to manage with this new class. [1] and [3] could be taken in consideration.

[1] https://svn.boost.org/svn/boost/sandbox/async/libs/tp/doc/html/index.html

[2] https://svn.boost.org/svn/boost/sandbox/async/libs/async/doc/html/index.html

[3] http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2013/n3562.pdf

This project could be mentored by Vicente J. Botet Escriba

Boost.TypeTraits/Extensions

Provide the Boost.TypeTraits missing C++11 type_traits is_assignable, is_constructible, ... implementation for c++11 and C++03 compilers.

There is an starting point implementation in [1].

[1] https://svn.boost.org/svn/boost/sandbox/conversion/libs/conversion_ext/doc/html/boost/conversion/reference.html#boost.conversion.reference.type_traits_extensions

This project could be mentored by Vicente J. Botet Escriba

Boost.FixedPoint

Provide an implementation of a FixedPoint library based on [1].

A prototype with a different interface is available at [2]

[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3352.html

[2] https://svn.boost.org/svn/boost/sandbox/fixed_point/

This project could be mentored by Vicente J. Botet Escriba

Boost.Chrono/Date

Provide an implementation of a Date library based on [1].

A prototype with a different interface is available also at [2].

[1] https://svn.boost.org/svn/boost/sandbox/chrono_date/libs/date/doc/date.html

[2] https://svn.boost.org/svn/boost/sandbox/chrono_date/

This project could be mentored by Vicente J. Botet Escriba

Boost.Expected

Port to Boost the Expected library from Andrei Alexandrescu [1] adding a class expected-or-error_code and make it ready for review.

[1] http://channel9.msdn.com/Shows/Going+Deep/C-and-Beyond-2012-Andrei-Alexandrescu-Systematic-Error-Handling-in-C

This project could be mentored by Vicente J. Botet Escriba

Boost.Exception/StackUnwinding

Make a real Boost library the StackUnwinding library [1] from Evgeny Panasyuk ready for review.

[1] https://github.com/panaseleus/stack_unwinding#d-style-scope-guardsactions

Boost.Pipelines

Provide boost::pipelines implementations base on the C++ Pipelines proposal [1] and the Google implementation in [2]

[1] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3534.html [2] https://code.google.com/p/google-concurrency-library/source/browse/include/pipeline.h

This project could be mentored by Vicente J. Botet Escriba

Moving Boost to Boost.Move

There a re a lot of Boost libraries that don't support move semantics. It would be nice if one student propose to adapt the some of the existing libraries.

The idea is to use Boost.Move so that an emulation is provided for compilers not supporting rvalue references.

Some of the libraries (let me know if I'm wrong are). My priority is given between [], lower numbers means higher priority. Of course others would have others priorities:

C++11

  • [0] Tuple or * Fusion/tuple (it seems that it support or will support c++11 move semantics but don't use Boost.Move)
  • [0] Bind
  • [1] Function
  • [2] SmartPtr
  • [8] Array ?

Accepted for C++14

Having an active proposal for C++1y

Other

Please help me to complete this list.

Adding constexpr and noexcept would be welcome also.

This project could be mentored by Vicente J. Botet Escriba

Ideas

Here you find a lot of ideas which have been mentioned on the Boost mailing list. Boost developers can use these ideas to create new projects (and add them to the list of projects above). And students can use them as start points to propose projects themselves. While these ideas are by far not as concrete as projects in the list above, they give students an idea what else they can work on within Boost. They also give students a chance to propose exactly the kind of project they'd like to work on instead of picking one from the predefined projects from above.

Boost libraries (extending/overhauling/finishing)

Algorithms

  • Radix sort
  • Approximate string matching
  • Full text search
  • Near Duplicate Detection (shingling)
  • Parallel algorithms (sort, for_each)
  • Algorithms for gpgpu
  • Kinetic scrolling

Data strutures

  • Trie data structure
  • B-tree data structure
  • Concurrent containers (unordered_map, unordered_set, vector, forward_list)
  • Slim string (as opposed to the fat std::string interface, maybe immutable, maybe policy based)

Media

  • Hardware graphics (OpenGL/OpenGL ES/DirectX abstraction)
  • Audio library (OpenAL/FMOD/etc)
  • Video processing library (gil for video)

Games

  • Physics library (ODE/Havok/PhysX/etc abstraction)
  • Input library (like DirectInput/XInput)
  • Ranking algorithms (elo/TrueSkill)

Databases

Math

  • Geometry library (convince the developers to submit Eigen 3 to Boost)

Memory

  • Memcache library
  • Memswap algorithm

File formats

  • JSON parsing libary
  • XML library

Communication

GUI

  • GUI library

C++ "extensions"

Hardware

  • CPUID
Last modified 9 years ago Last modified on Apr 27, 2013, 6:26:26 AM
Note: See TracWiki for help on using the wiki.