wiki:SoC2016

Version 11 (modified by mariomulansky, 7 years ago) ( diff )

added event detection odeint project

Boost Google Summer of Code 2016

Welcome to the Boost C++ Libraries' home page for Google Summer of Code (GSoC) 2016. This page provides information about suggested student projects, proposal submission templates, advice on writing good proposals, and links to information on getting started writing with Boost.

Quick summary of policies and processes for this year

After two extremely successful years of GSoC at Boost during which even some of the old timers have been wowed by the C++ code some of the students have brought us, we are minded to keep in place the more rigorous candidate selection process which involves preferentially selecting over all others every GSoC Boost candidate who takes a C++ programming aptitude test or provides links to at least 1,000 lines (excluding comments and whitespace) non-coursework C++ library (not application nor solution) open source code. Note if following the second route, code should have been open sourced at least three months ago, and show a log of commits improving the library over time.

What students should do now

  1. Students should review the list of ideas from previous GSoCs and the archives of the Boost developer's mailing list relating to GSoC (tip: try searching boost-dev for subjects tagged [gsoc16] or [gsoc15] etc). You may find this searchable archive of boost-dev useful.
  1. If you wish to proceed, you need to join the Boost Developer's mailing list and find a mentor who will be an experienced Boost developer in one of the Boost libraries listed at http://www.boost.org/doc/libs/1_60_0/. Read the Boost Discussion Policy in full, and once read in full go to http://lists.boost.org/mailman/listinfo.cgi/boost and subscribe.
  1. After as an absolute minimum reading all posts tagged [gsoc16], students should write a well researched and intelligent message with [gsoc16] at the front of the subject line to that developer's mailing list seeking a mentor, and be as flexible as possible in finding a topic that both they and the mentor is interested in upon which to base a GSoC project proposal text to be submitted to Google.

As a general rule, a well written and researched proposal to extend or improve an existing mature Boost library is likely to be much better received that student originated ideas for new libraries or facilities.

  1. Once a potential mentor and project idea is found, the student must write a project proposal which should follow this submission template.

Potential mentors may add precanned project ideas with programming competency tests to this page below as GSoC approaches. These may prove useful in starting a discussion with potential mentor(s) whom the student should approach directly.

Github's for standalone GSoCs past and present

Since 2013 with Boost's transition to git we have kept a single umbrella org on github for those GSoCs which are fairly self standing. Incremental extensions to existing libraries usually enter that library's main git repo as an experimental branch. Here are those orgs:

Students may find examining past GSoC source code and commit histories of use.

Historical GSoC Ideas pages for years 2006 to now

Suggested GSoC project proposals

To any potential mentor adding a proposal HERE please use this template

1. Static map (boost::static_map<Key, T, ConstExprHash = boost::constexpr_hash<Key>, Pred = boost::constexpr_equal_to<Key>>)

Potential mentors: Niall Douglas

Background

Even with all of Boost's facilities and using C++ 14, this program represents the currently best available method of implementing a static constant associative map of keys to values (play with it in an online compiler here and see the assembler generated here):

#include <initializer_list>
#include <experimental/string_view>
#include <map>
#include <unordered_map>
#include <iostream>

using std::experimental::string_view;

enum class weekday { sunday, monday, tuesday, wednesday, thursday, friday, saturday };

// initializer_list, pair and string_view are constexpr, so this list can be constexpr
// (i.e. lives in the mind of the compiler only and has zero representation in generated code)
#define STRING_VIEW(str) { str, sizeof(str)-1 }
constexpr std::initializer_list<std::pair<const string_view, weekday>> string_to_weekday {
    { STRING_VIEW("sunday"),    weekday::sunday },
    { STRING_VIEW("monday"),    weekday::monday },
    { STRING_VIEW("tuesday"),   weekday::tuesday },
    { STRING_VIEW("wednesday"), weekday::wednesday },
    { STRING_VIEW("thursday"),  weekday::thursday },
    { STRING_VIEW("friday"),    weekday::friday },
    { STRING_VIEW("saturday"),  weekday::saturday }
};

int main(void)
{
  {
    // Calls malloc() at least 7 times
    static const std::map<string_view, weekday> to_weekday1 = string_to_weekday;
    std::cout << "'monday' maps to " << static_cast<int>(to_weekday1.at("monday")) << std::endl;
    std::cout << "'friday' maps to " << static_cast<int>(to_weekday1.at("friday")) << std::endl;
    // Calls free() at least 7 times
  }
  {
    // Calls malloc() at least 8 times
    static const std::unordered_map<string_view, weekday> to_weekday2 = string_to_weekday;
    std::cout << "'monday' maps to " << static_cast<int>(to_weekday2.at("monday")) << std::endl;
    std::cout << "'friday' maps to " << static_cast<int>(to_weekday2.at("friday")) << std::endl;
    // Calls free() at least 8 times
  }
  return 0;
}

As you can see, we have either the choice of linearly scanning the associative array string_to_weekday for keys (which has linear complexity to number of items per lookup), or we can copy at runtime the associative array into an associative map which can thereafter be queried at runtime in either O(log N) or O(1) complexities respectively. Both these options generate very significant amounts of runtime code.

Almost every seasoned C++ programmer has at least once (and often many times throughout a career) written a static constant associative map which is initialised once from static constant data usually at process start, and is thereafter never modified. Given that C++ 14 can implement an associative map where key lookups are constexpr when possible, it would be a very worthwhile addition to Boost and C++ in general if we gained a high quality implementation of a static map class.

GSoC project proposal

  1. To seek consensus from the Boost Developer's mailing list on a suitable design for a Boost static_map class with the following design features:
    1. Whose number of items and keys are completely fixed from construction onwards.
    2. All features of which can be used in constant expressions (i.e. all member functions are marked constexpr).
    3. Which can be completely statically initialised in the mind of the compiler, or in static global storage.
    4. Values, though not keys nor number of items, are modifiable.
    5. Which performs constexpr key to value lookup to the maximum possible extent.

It turns out this list of requirements is in fact quite challenging to implement, and even seasoned Boost programmers have to stop and ponder this one. To demonstrate the difficulty in code:

constexpr std::pair<int, const char *> map_data[] = {
  { 5, "apple" },
  { 8, "pear" },
  { 0, "banana" }
};
// Easy: generates no runtime code
constexpr auto cmap = make_static_map(map_data);
// Easy: generates no runtime code, this is as if literal "apple"
constexpr const char *what_is_5 = cmap[5];

// Challenging: needs to only generate code loading immediately from a memory location.
// It must NOT generate any additional runtime overhead like hashing nor searching.
const char *what_is_8 = cmap[8];

The reason why the last statement is challenging is because of the rules of when a constexpr-capable code path is executed by the compiler: the compiler only has to constexpr execute when all the inputs are constant expressions and all the outputs will be used in constant expressions, and it only may constexpr execute when all the inputs are constant expressions. Indeed empirical testing of clang 3.7 and VS2015 found both currently do not constexpr execute when the result is not stored constexpr, so in a naive implementation the last statement does generate a runtime lookup on current compiler technology despite the constant expression inputs. If you'd like to see more detail about this specific problem, have a look at the assembler generated for a toy static_map implementation at https://goo.gl/eO7ooa.

  1. To implement a static_map class which runs on at least two major C++ compilers.
    • Hopefully in C++ 14, however upcoming features in C++ 1z do make the problem considerably easier to fix.
  1. To implement a comprehensive unit test suite for the static_map class, including tests ensuring no runtime overhead is generated for the challenging use case exampled above.
  1. To configure per-commit continuous integration for the unit test suite on at least one of the major public open source CI services (e.g. Travis, Appveyor).
  1. To write documentation to Boost quality levels for the new container class, including time and space complexity guarantees and benchmarks, and exception guarantees for each API and each use of each API.

Potential project extension funded by Boost

  • static_multimap (keys are not unique)
  • static_set and static_multiset (keys only)
  • static_ordered_map and static_ordered_multimap (iteration is in lexical order of keys)
  • static_bimap (values are also keys)
  • static_index<T...> (multi-column table with some columns being hash indexed)

Programming competency test

Write a function which uses one of the compile time string hashing techniques at https://stackoverflow.com/questions/2111667/compile-time-string-hashing (or a constexpr hash function of your preference) to generate a constexpr std::array<unsigned> of its input strings. The function ought to have the following prototype:

template<class... Strings> constexpr std::array<unsigned, sizeof...(Strings)> hash_strings(Strings&&... strings);

// Examples of usage
constexpr std::array<unsigned, 0> string_hashes0 = hash_strings();
constexpr std::array<unsigned, 2> string_hashes2 = hash_strings("niall", "douglas");
constexpr std::array<unsigned, 4> string_hashes4 = hash_strings("google", "summer", "of", "code");
// This should fail elegantly and usefully ...
//constexpr std::array<unsigned, 0> string_hashes_fail = hash_strings(5);

In your submission you should:

  1. Explain why you chose the constexpr string hash function you did and the strengths of your choice over other choices.
  2. Test that the output is identical whether executed by the compiler at compile time, or at runtime.
  3. Prove that the compile time constexpr implementation generates no runtime code whatsoever.

Submission of the programming test should be via copying and pasting the code you wrote into the end of the proposal you submit to Google Melange.

2. Boost.odeint

Potential mentors: Karsten Ahnert and Mario Mulansky

Background

Boost.odeint http://www.odeint.com is a library devoted to finding numerical solutions of ordinary differential equations (ODEs). It is developed in a generic way using Template Metaprogramming which leads to extraordinary high flexibility at top performance. It is widely used in academic and industrial projects. Boost.odeint has been a Google Summer of Code project in 2011 and 2013 and we would like again to see students involved in the development of this numerical library.

Project 1: Implicit Routines

The current focus of odeint is on explicit routines, namely the Runge-Kutta schemes or the multi-step methods. However, we would like to expand odeint by adding implicit routines in the same flexible manner. Implicit routines are important when dealing with stiff systems, but also ensure stability in some cases of discretized partial differential equations (PDEs). At the moment, odeint provides an implicit Euler and Rosenbrock algorithm, but the implementation is restricted to Boost.uBlas. As one of the main objectives of odeint is to provide highly flexible algorithms, we want to change this implementation to have a greater flexibility and interchangeability similar to that of the explicit routines. This project would not only require pure coding, but first also a considerable amount of design work.

This project does not only require profound knowledge on C++ and generic programming, but we would also like to see some experience on numerical algorithms and solving ODEs, if possible.

  • Develop a design on how the requirements on implicit routines can be modularized
  • Implement a prototype showing the usability of the design
  • Change the existing routines to the new design
  • Implement other backends (MTL, eigen,...) to demonstrate the flexibility
  • Provide examples and documentation
  • (Add more implicit routines, if time permits)

Project 2: Event Detection

Currently, Boost.odeint does not offer specific functionality for event detection. However, this is a common use-case and has been discussed on the odeint Github page in the past. Some examplary implementation of an event-detection algorithm was added recently as an example. On the other hand, sophisticated event detection algorithms have been explored in the context of computing Poincare sections. The fundamental task of this project, however, is to develop a clean, generic interface to incorporate event detection algorithms into the existing structure of Boost.odeint.

The project requires profound knowledge of C++ and generic programming, as well as a good understanding of Boost.odeint and ODE simulations and numerical algorithms.

  • Design an interface to incorporate event detection into Boost.odeint
  • Prototypical implementation and testing the usability of the design in several use cases
  • Implementation of sophisticated algorithms for event detection
  • Documentation and examples

3. Boost.Python improvements

Potential mentor: Stefan Seefeld

This project requires knowledge of C++11 and Python

Potential projects:

  • Modernize code base by using standard C++11 features rather than their equivalent Boost APIs (type traits, notably).
  • Integrate NumPy support into Boost.Python (https://github.com/ndarray/Boost.NumPy), or prepare for stand-alone review.

To any potential mentor adding a proposal HERE please use this template

Note: See TracWiki for help on using the wiki.