
Google Summer of Code 2011

Welcome to the Boost C++ Libraries' home page for Google Summer of Code (GSoC). This page provides information about student projects, proposal submission templates, advice on writing good proposals, and links to information on getting started with Boost.

This year Boost is looking to fund work on a number of different kinds of proposals:

  • toolkit-like extensions to existing libraries,
  • finishing or extending sandbox libraries,
  • new data structures and algorithms, and
  • multiple competing proposals for the same project.

For projects involving new or experimental libraries, the process of getting source code "Boost-branded" can take much longer than a single summer. In many cases, it can take much longer than a single year. Even if a library is accepted, there is an expectation that the original author will continue to maintain it. Building a library as part of Boost can easily entail a multi-year commitment. For this reason, we are willing to consider multi-year GSoC projects. However, prospective students must limit the scope of their work to a single summer. We may invite the most successful students to re-apply in 2012.

Requirements

Students must submit a proposal. A template for the proposal can be found here. Hints for writing a good proposal can be found here.

We strongly suggest that students interested in developing a proposal for Boost discuss their ideas on the mailing list in order to help refine the requirements and goals. Students who actively discuss projects on the mailing list are also ranked ahead of those who do not.

Projects

The following projects have been suggested by potential mentors. If the descriptions of these projects seem a little vague... Well, that's intentional. We are looking for students to develop requirements for their proposals by doing initial background research on the topic, and by interacting with the community on the mailing list to help identify expectations.

Projects from previous years can be found here. There are still a number of interesting projects found in these pages.

Boost.Polygon edge, polyline and edge set concepts

The polygon library is missing an edge concept. There is a lot of code in the library that operates on edges, but no interface that exposes those operations to the user. There are also a number of "edge set" algorithms that are of interest, so in addition to an edge concept the library needs an "edge set" concept. Operations on edge sets include Booleans (intersection, union) and connectivity extraction. Generalizing these to map overlay of edge sets would allow a single generic algorithm to implement all of these operations, similar to how polygon set operations are already implemented. The addition of a polyline concept, of which polygon would be a refinement restricted to closed-cycle polylines, would allow polygons to interoperate with edge sets in a very natural and productive syntax. This project would be a great opportunity to learn about concept-based generic programming and type systems, how to design and implement generic algorithms, and computational geometry in general.

Mentor: Lucanus J. Simonson

Boost.Python and NumPy

Boost.Python currently has limited support for NumPy arrays. The library can be extended to support and interoperate with these and other Python data structures.

Mentor: Stefan Seefeld

Checks & Hashes

Check strings and digits are an invaluable tool for avoiding mistakes in data entry, storage and transmission.

There are many public algorithms available, but not a coherent collection of C++ functions.

The suggested project is to provide such a collection in a coherent format, fully tested using Boost.Test (including tests with various faulty input), and fully documented to Boost quality using Quickbook, Doxygen, and AutoIndex, in both HTML and PDF.

A key target is to get it to a finished state, rather than to deal with all possible check types.

Much code is already available from Boost and elsewhere (and I can contribute some to get off to a quicker start), so the project involves gathering, testing, and documenting it rather than much complex coding.

A key design decision the student must take is what format (and names) the functions should take.

Any platform is OK, but it must use bjam to drive the build process. A good demonstration would be to 'package up' something trivially simple like ISBN or something from Boost Cyclic redundancy checks, preparing a jamfile, some Boost style tests, and some skeleton documentation in Quickbook.
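
As a rough idea of the scale of such a demonstration, an ISBN-10 check could be sketched as follows. The function name isbn10_valid and its string-based interface are assumptions made purely for illustration; choosing the real format and names is part of the project.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not an existing Boost function).
// Returns true when `isbn` is a valid ISBN-10; hyphens and spaces are ignored.
// The weighted sum 10*d1 + 9*d2 + ... + 1*d10 must be divisible by 11,
// and the final character may be 'X', which stands for the value 10.
inline bool isbn10_valid(const std::string& isbn) {
    int sum = 0;
    int position = 0;  // number of ISBN characters consumed so far
    for (char c : isbn) {
        if (c == '-' || c == ' ') continue;
        if (position >= 10) return false;  // too many characters
        int value;
        if (std::isdigit(static_cast<unsigned char>(c))) {
            value = c - '0';
        } else if ((c == 'X' || c == 'x') && position == 9) {
            value = 10;  // 'X' is only allowed as the check character
        } else {
            return false;
        }
        sum += (10 - position) * value;
        ++position;
    }
    return position == 10 && sum % 11 == 0;
}
```

Tests with faulty input (wrong length, stray characters, a misplaced 'X') are exactly the kind of cases the Boost.Test suite would be expected to cover.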

A few sample possible checks:

Simple modulo 256 etc check values and digits.

Boost's cyclic redundancy check codes http://www.boost.org/doc/libs/1_45_0/libs/crc/index.html

http://www.netrino.com/Embedded-Systems/How-To/CRC-Calculation-C-Code

  • crc_16_type: BISYNCH, ARC
  • crc_ccitt_type: designated by CCITT (Comité Consultatif International Télégraphique et Téléphonique)
  • crc_xmodem_type: XMODEM
  • crc_32_type: PKZip, AUTODIN II, Ethernet, FDDI

MD5 hash http://www.md5.net/

SHA hashes http://en.wikipedia.org/wiki/SHA-1 ...

Luhn algorithm http://en.wikipedia.org/wiki/Luhn_algorithm

Verhoeff algorithm http://en.wikipedia.org/wiki/Verhoeff_algorithm

(These two are used by many of the others below).

European Article numbering EAN Symbol Specification Manual,

Universal Product Code, Uniform Code Council, Dayton, Ohio, USA.

Version of check used by Mastercard, VISA, and most other credit card companies. http://www.beachnet.com/~hstiles/cardtype.html

Generalised to arbitrary radix version allowing any characters (not just digits). Gene Callahan, Dr Dobb's Journal, Dec 1995, 131, 132 & 149. Generating Sequential keys in an Arbitrary Radix.

IBAN International Banking format http://en.wikipedia.org/wiki/International_Bank_Account_Number

ISBN http://en.wikipedia.org/wiki/International_Standard_Book_Number

ISSN http://en.wikipedia.org/wiki/International_Standard_Serial_Number
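
To give a flavour of the code involved, here is a minimal bitwise sketch of the CRC-32 variant used by PKZip and Ethernet (reflected polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF). It uses only the standard library, and the name crc32_bitwise is hypothetical; it is not the interface of the existing Boost CRC library.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: bit-at-a-time CRC-32 over the bytes of `data`.
// Slow but easy to check by hand; table-driven versions are much faster.
inline std::uint32_t crc32_bitwise(const std::string& data) {
    std::uint32_t crc = 0xFFFFFFFFu;  // initial value
    for (char c : data) {
        crc ^= static_cast<unsigned char>(c);
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; apply the reflected polynomial when the low bit was set.
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0xEDB88320u;
            } else {
                crc >>= 1;
            }
        }
    }
    return crc ^ 0xFFFFFFFFu;  // final XOR
}
```

For the ASCII string "123456789" this variant yields the widely published check value 0xCBF43926, which makes a convenient first test case.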

And there are many, many more potential checks.
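
Many of these checks are small. The Luhn algorithm listed above, for instance, can be sketched in a few lines. The name luhn_valid and the digits-only string interface are assumptions for illustration, not a proposed final design.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns true when `number` (digits only, check digit
// last) passes the Luhn check. Starting from the right, every second digit is
// doubled; doubled values above 9 have 9 subtracted; the total must be a
// multiple of 10.
inline bool luhn_valid(const std::string& number) {
    if (number.empty()) return false;
    int sum = 0;
    bool double_it = false;  // the rightmost digit is not doubled
    for (std::string::const_reverse_iterator it = number.rbegin();
         it != number.rend(); ++it) {
        if (!std::isdigit(static_cast<unsigned char>(*it))) return false;
        int digit = *it - '0';
        if (double_it) {
            digit *= 2;
            if (digit > 9) digit -= 9;
        }
        sum += digit;
        double_it = !double_it;
    }
    return sum % 10 == 0;
}
```

A real library would also need to decide how to treat separators such as spaces and hyphens, which this sketch simply rejects.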

Mentor(s): Paul A. Bristow and others?

SIMD library

SIMD is a class of processor instruction sets that allow an operation to be executed on multiple data elements simultaneously; those instructions are also referred to as vector instructions.
Popular examples of SIMD instruction sets include MMX, SSE, and AltiVec.

A SIMD abstraction component has been in development for several years as part of the NT2 project, and work is under way to retrofit it into a Boost library.

The project involves consolidating support for non-Intel processors, in particular the AltiVec instruction set, and polishing the library in all aspects (better docs, examples, tests, benchmarks and general Boostification improvements).
Benchmarks are especially important due to the nature of the library, and are necessary so as to validate the work that has been done.

Implementing saturated arithmetic (i.e. values stay at the minimum or maximum instead of overflowing) could also be part of the project. These operations are provided directly by AltiVec, but a software fallback should be implemented as well.
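
A scalar software fallback for saturated addition might look like the sketch below. It is only an illustration of the clamping behaviour for 8-bit values; the function names are hypothetical and do not reflect the NT2 interface, and a real fallback would operate on whole vectors.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: unsigned 8-bit saturating add.
// The sum is computed in a wider type and clamped to the maximum (255).
inline std::uint8_t add_saturated_u8(std::uint8_t a, std::uint8_t b) {
    const unsigned max = std::numeric_limits<std::uint8_t>::max();
    const unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
    return static_cast<std::uint8_t>(sum > max ? max : sum);
}

// Hypothetical helper: signed 8-bit saturating add.
// The sum is clamped to the range [-128, 127] instead of wrapping around.
inline std::int8_t add_saturated_i8(std::int8_t a, std::int8_t b) {
    const int lo = std::numeric_limits<std::int8_t>::min();
    const int hi = std::numeric_limits<std::int8_t>::max();
    int sum = static_cast<int>(a) + static_cast<int>(b);
    if (sum < lo) sum = lo;
    if (sum > hi) sum = hi;
    return static_cast<std::int8_t>(sum);
}
```

Widening before adding keeps the sketch free of overflow; benchmarks would show how such a fallback compares with the native instructions.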

A talk is planned for BoostCon 2011 (May 15-20) to demonstrate the library, which by that point will already be somewhat Boostified by the NT2 developers.

SSH access to PowerPC G5 and Cell computers will be provided for carrying out the work.

Requirements for the students are as follows:

  • Solid knowledge of modern C++, and comfort with both low-level code that deals with memory and template code.
    Experience with Boost.Proto is a plus, but not strictly required.
  • Basic understanding of SIMD. AltiVec is very orthogonal, and therefore very simple, in comparison to SSE, so it can be picked up quickly.

Mentors: Joel Falcou and Mathias Gaunard

ConceptTraits library

The ConceptTraits library was abandoned when Concepts were expected to become part of the C++ standard. Unfortunately, the concepts feature will be missing from the next standard.

This library was composed mainly of 3 parts:

  • operators traits
  • macros to generate member traits
  • concept traits

The first two parts have been covered by the Boost.TypeTraits operator extension, reviewed in March, and by Boost.TTI, respectively.

It would be great to finish the 3rd part and adapt it to the new libraries.
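
To illustrate the distinction, the sketch below shows an operator trait and a concept trait built on top of it, written with C++11 facilities. All names here (has_less, is_less_than_comparable_value, opaque) are hypothetical and are not taken from ConceptTraits, Boost.TypeTraits, or Boost.TTI.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical operator trait: `value` is true when `a < b` is a valid
// expression for two const T objects.
template <typename T>
class has_less {
    // Chosen when the comparison expression is well-formed (SFINAE).
    template <typename U>
    static auto test(int)
        -> decltype(void(std::declval<const U&>() < std::declval<const U&>()),
                    std::true_type());
    // Fallback when the expression is ill-formed.
    template <typename U>
    static std::false_type test(...);

public:
    static const bool value = decltype(test<T>(0))::value;
};

// Hypothetical concept trait: combines simpler traits into one named
// requirement, here "copy constructible and comparable with <".
template <typename T>
struct is_less_than_comparable_value {
    static const bool value =
        has_less<T>::value && std::is_copy_constructible<T>::value;
};

// Example type with no operator<, used to show the negative case.
struct opaque {};

static_assert(has_less<int>::value, "int supports operator<");
static_assert(!has_less<opaque>::value, "opaque has no operator<");
```

A concept trait only checks syntax; semantic requirements such as a strict weak ordering cannot be verified this way.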

Boost.ConceptTraits

Mentor: Vicente J. Botet Escriba

Boost.Process

Boost.Process should become the library to manage system processes. While a first version was created in 2006 (in a GSoC project), we never managed to finish the library. This is not for lack of attempts in past years; we even had a review in February 2011. However, even with the latest version (known as 0.4), developers in the Boost community were not happy.

A lot of work was done in the GSoC 2010 program, when we created the current version 0.4. You should familiarize yourself with that version - please find the documentation at http://www.highscore.de/boost/gsoc2010/ and the library at http://www.highscore.de/boost/gsoc2010/process.zip. Please also read the conclusions of the review at http://article.gmane.org/gmane.comp.lib.boost.user/66363. They give you an idea of what the next steps will be and what you could work on.

Mentor: Boris Schaeling (boris[at]highscore.de)

Last modified on Feb 19, 2013, 11:07:05 AM