
Version 8 (modified by Mathias Gaunard, 12 years ago)


Google Summer of Code 2011

Welcome to the Boost C++ Libraries' home page for Google Summer of Code (GSoC). This page provides information about student projects, proposal submission templates, advice on writing good proposals, and links to information on getting started with Boost.

This year Boost is looking to fund work on a number of different kinds of proposals:

  • toolkit-like extensions to existing libraries,
  • finishing or extending sandbox libraries,
  • new data structures and algorithms, and
  • multiple competing proposals for the same project.

For projects involving new or experimental libraries, the process of getting source code "Boost-branded" can take much longer than a single summer. In many cases, it can take much longer than a single year. Even if a library is accepted, there is an expectation that the original author will continue to maintain it. Building a library as part of Boost can easily entail a multi-year commitment. For this reason, we are willing to consider multi-year GSoC projects. However, prospective students must limit the scope of their work to a single summer. We may invite the most successful students to re-apply in 2012.

Requirements

There are only two requirements of students submitting Boost projects:

  • Develop a proposal
  • Complete an aptitude test

These requirements also entail interacting with the Boost community on the development mailing list (http://www.boost.org/community/groups.html), checking source code out of the Boost Subversion repository, programming, and compiling a project.

Developing a Proposal

The Aptitude Test

Projects

The following projects have been suggested by potential mentors. If the descriptions of these projects seem a little vague... Well, that's intentional. We are looking for students to develop requirements for their proposals by doing initial background research on the topic and by interacting with the community on the mailing list to help identify expectations.

Boost.Polygon edge, polyline and edge set concepts

The polygon library is missing an edge concept. There is a lot of code in the library that operates on edges, but no interface that exposes those operations to the user. There are also a number of "edge set" algorithms of interest, so in addition to an edge concept the library needs an "edge set" concept. Operations on edge sets include booleans (intersection, union) and connectivity extraction. Generalizing these to map overlay of edge sets would allow a single generic algorithm to implement all of these operations, similar to how polygon set operations are already implemented. Adding a polyline concept, of which polygon would be a refinement restricted to closed-cycle polylines, would allow polygons to interoperate with edge sets in a very natural and productive syntax. This project would be a great opportunity to learn about concept-based generic programming and type systems, how to design and implement generic algorithms, and computational geometry in general.
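To make the idea of an "edge concept" more concrete, here is a minimal sketch of a traits-based interface of the kind such a proposal might explore. Every name in it (edge_traits, edge_end, my_segment, squared_length) is hypothetical and chosen for illustration only; none of it is existing Boost.Polygon API, and the real design is left to the student.

```cpp
// Hypothetical sketch: these names are NOT part of Boost.Polygon.
enum class edge_end { low, high };

struct point { int x; int y; };

// Primary template is left undefined on purpose: a user type models the
// (hypothetical) edge concept only by specializing edge_traits for it.
template <typename Edge>
struct edge_traits;

// A user-defined type that knows nothing about the library.
struct my_segment { point a; point b; };

// Adapting the user type to the concept without modifying it.
template <>
struct edge_traits<my_segment> {
    using point_type = point;
    static point_type get(const my_segment& s, edge_end e) {
        return e == edge_end::low ? s.a : s.b;
    }
};

// A generic algorithm written only against edge_traits.
// It assumes the point type exposes integer members x and y.
template <typename Edge>
long long squared_length(const Edge& e) {
    const auto lo = edge_traits<Edge>::get(e, edge_end::low);
    const auto hi = edge_traits<Edge>::get(e, edge_end::high);
    const long long dx = static_cast<long long>(hi.x) - lo.x;
    const long long dy = static_cast<long long>(hi.y) - lo.y;
    return dx * dx + dy * dy;
}
```

With this shape, calling squared_length on a my_segment from (0, 0) to (3, 4) yields 25, and any other type with an edge_traits specialization would work with the same algorithm unchanged.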

Boost.Python and NumPy

Boost.Python currently has limited support for NumPy arrays.

Mentors: Stefan Seefeld

Checks & Hashes

Check strings and digits are an invaluable tool for avoiding mistakes in data entry, storage and transmission.

There are many public algorithms available, but not a coherent collection of C++ functions.

The suggested project is to provide such a collection in a coherent format, fully tested using Boost.Test (including tests with various kinds of faulty input), and very fully documented to Boost quality, using Quickbook, Doxygen, and AutoIndex, in both HTML and PDF.

A key target is to get it to a finished state, rather than to deal with all possible check types.

Much code is already available from Boost and elsewhere (and I can contribute some to get off to a quicker start), so the project involves gathering, testing and documenting it rather than much complex coding.

A key design decision the student must make is what format (and names) the functions should take.

Any platform is OK, but the project must use bjam to drive the build process. A good demonstration would be to 'package up' something trivially simple like ISBN, or something from Boost's cyclic redundancy checks, preparing a jamfile, some Boost-style tests, and some skeleton documentation in Quickbook.
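As an illustration of how small such a "trivially simple" starting point can be, here is a minimal sketch of an ISBN-10 validator. The function name and its handling of separators are assumptions made for this example, not a proposed interface. It relies on the standard ISBN-10 rule: the ten characters are weighted 10 down to 1, 'X' in the last position stands for 10, and the weighted sum must be divisible by 11.

```cpp
#include <cctype>
#include <string>

// Hypothetical function name; hyphens and spaces are skipped.
bool is_valid_isbn10(const std::string& text) {
    int sum = 0;
    int position = 0;  // number of significant characters consumed so far
    for (char c : text) {
        if (c == '-' || c == ' ') continue;
        if (position >= 10) return false;  // more than ten significant characters
        int value = 0;
        if (std::isdigit(static_cast<unsigned char>(c))) {
            value = c - '0';
        } else if ((c == 'X' || c == 'x') && position == 9) {
            value = 10;  // 'X' means 10 and is only legal as the check character
        } else {
            return false;  // any other character is faulty input
        }
        sum += (10 - position) * value;  // weights run 10, 9, ..., 1
        ++position;
    }
    return position == 10 && sum % 11 == 0;
}
```

A real submission would add Boost.Test cases for faulty input (wrong length, stray characters, misplaced 'X'), which this sketch rejects by returning false.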

A few sample possible checks:

Simple check values and digits (modulo 256, etc.).

Boost's cyclic redundancy check (CRC) codes http://www.boost.org/doc/libs/1_45_0/libs/crc/index.html

http://www.netrino.com/Embedded-Systems/How-To/CRC-Calculation-C-Code

  • crc_16_type: BISYNCH, ARC
  • crc_ccitt_type: designated by CCITT (Comité Consultatif International Télégraphique et Téléphonique)
  • crc_xmodem_type: XMODEM
  • crc_32_type: PKZip, AUTODIN II, Ethernet, FDDI
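For readers unfamiliar with what these types compute, the following is a minimal, unoptimized sketch of the CRC-32 variant used by PKZip and Ethernet (reflected polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF), which is the parameter set behind crc_32_type. The function name is hypothetical and the bit-at-a-time loop is chosen for clarity; Boost.CRC itself is table-driven and should be preferred in practice.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bit-at-a-time CRC-32 (PKZip / Ethernet parameters).
std::uint32_t crc32_bitwise(const unsigned char* data, std::size_t length) {
    std::uint32_t crc = 0xFFFFFFFFu;  // initial remainder
    for (std::size_t i = 0; i < length; ++i) {
        crc ^= data[i];  // fold the next byte into the low bits
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; if the bit shifted out was set, apply the
            // reflected form of the polynomial 0x04C11DB7.
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;  // final XOR
}
```

The conventional check value for this CRC is 0xCBF43926 for the nine ASCII bytes "123456789", which makes a convenient first unit test.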

MD5 hash http://www.md5.net/

SHA hashes http://en.wikipedia.org/wiki/SHA-1 ...

Luhn algorithm http://en.wikipedia.org/wiki/Luhn_algorithm

Verhoeff algorithm http://en.wikipedia.org/wiki/Verhoeff_algorithm

(These two are used by many of the others below).
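The Luhn algorithm is short enough to sketch in full. The version below is an illustration only, with a hypothetical function name and an assumed policy of ignoring spaces and hyphens: starting from the rightmost digit, every second digit is doubled (subtracting 9 when the result exceeds 9), and the total must be divisible by 10.

```cpp
#include <cctype>
#include <string>

// Hypothetical function name; spaces and hyphens are ignored.
bool passes_luhn(const std::string& text) {
    int sum = 0;
    int count = 0;           // number of digits seen
    bool double_it = false;  // the rightmost digit is not doubled
    for (auto it = text.rbegin(); it != text.rend(); ++it) {
        const char c = *it;
        if (c == ' ' || c == '-') continue;
        if (!std::isdigit(static_cast<unsigned char>(c))) return false;
        int d = c - '0';
        if (double_it) {
            d *= 2;
            if (d > 9) d -= 9;  // same as adding the two digits of the product
        }
        sum += d;
        double_it = !double_it;
        ++count;
    }
    // Require at least two digits so that an empty or single-digit
    // string is not accepted (a design choice of this sketch).
    return count > 1 && sum % 10 == 0;
}
```

For example, 79927398713 passes the check, while changing its last digit to 0 makes it fail.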

European Article Numbering (EAN): EAN Symbol Specification Manual.

Universal Product Code, Uniform Code Council, Dayton, Ohio, USA.

Version of check used by Mastercard, VISA, and most other credit card companies. http://www.beachnet.com/~hstiles/cardtype.html

A version generalised to an arbitrary radix, allowing any characters (not just digits): Gene Callahan, "Generating Sequential Keys in an Arbitrary Radix", Dr Dobb's Journal, Dec 1995, 131, 132 & 149.

IBAN International Banking format http://en.wikipedia.org/wiki/International_Bank_Account_Number
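As a sketch of the IBAN check, the code below implements only the mod-97 test: the first four characters are moved to the end, letters are replaced by numbers (A=10 through Z=35), and the resulting decimal number must leave a remainder of 1 when divided by 97. The function name is hypothetical, and country-specific length and format rules are deliberately left out.

```cpp
#include <cctype>
#include <string>

// Hypothetical function name. Checks the mod-97 rule only; it does not
// verify country codes or country-specific lengths. Assumes ASCII.
bool passes_iban_mod97(const std::string& text) {
    std::string compact;
    for (char c : text) {
        if (c == ' ') continue;  // allow the usual grouping in blocks of four
        const unsigned char u = static_cast<unsigned char>(c);
        if (!std::isalnum(u)) return false;
        compact += static_cast<char>(std::toupper(u));
    }
    if (compact.size() < 5 || compact.size() > 34) return false;

    // Move the country code and check digits (first four characters) to the end.
    const std::string rearranged = compact.substr(4) + compact.substr(0, 4);

    // Compute the remainder incrementally so no big-integer type is needed.
    int remainder = 0;
    for (char c : rearranged) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            remainder = (remainder * 10 + (c - '0')) % 97;
        } else {
            const int value = c - 'A' + 10;  // A=10 ... Z=35, always two digits
            remainder = (remainder * 100 + value) % 97;
        }
    }
    return remainder == 1;
}
```

The commonly quoted example GB82 WEST 1234 5698 7654 32 passes this test, and altering its final digit makes it fail.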

ISBN http://en.wikipedia.org/wiki/International_Standard_Book_Number

ISSN http://en.wikipedia.org/wiki/International_Standard_Serial_Number

And there are many, many more potential checks.

Mentor(s): Paul A. Bristow and others?

SIMD library

SIMD is a class of processor instruction sets that allow a single operation to be executed on multiple data elements simultaneously; such instructions are also referred to as vector instructions.
Popular examples of SIMD instruction sets include MMX, SSE, and AltiVec.
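To illustrate what a SIMD abstraction looks like from the user's side, here is a minimal portable sketch. The pack type and the load, store and add_arrays functions are hypothetical names invented for this example and are not the NT2 interface; the lanes are processed with ordinary scalar loops, whereas a real implementation would map each operation to one vector instruction (for example, SSE's addps for four floats).

```cpp
#include <array>
#include <cstddef>

// Hypothetical type, not the NT2 interface: a fixed-width pack of N lanes.
template <typename T, std::size_t N>
struct pack {
    std::array<T, N> lanes;
};

// Load N consecutive elements from memory into a pack.
template <typename T, std::size_t N>
pack<T, N> load(const T* p) {
    pack<T, N> r;
    for (std::size_t i = 0; i < N; ++i) r.lanes[i] = p[i];
    return r;
}

// Store the N lanes of a pack back to memory.
template <typename T, std::size_t N>
void store(const pack<T, N>& v, T* p) {
    for (std::size_t i = 0; i < N; ++i) p[i] = v.lanes[i];
}

// Lane-wise addition; conceptually a single vector instruction.
template <typename T, std::size_t N>
pack<T, N> operator+(const pack<T, N>& a, const pack<T, N>& b) {
    pack<T, N> r;
    for (std::size_t i = 0; i < N; ++i) r.lanes[i] = a.lanes[i] + b.lanes[i];
    return r;
}

// Adds two float arrays four elements at a time, then handles the
// leftover elements one by one.
void add_arrays(const float* x, const float* y, float* out, std::size_t n) {
    constexpr std::size_t width = 4;  // assumed lane count (as in SSE floats)
    std::size_t i = 0;
    for (; i + width <= n; i += width) {
        store(load<float, width>(x + i) + load<float, width>(y + i), out + i);
    }
    for (; i < n; ++i) out[i] = x[i] + y[i];  // scalar tail
}
```

The point of such an abstraction is that add_arrays stays the same whether pack is backed by SSE, AltiVec, or the scalar fallback shown here; only the pack operations change per instruction set.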

A SIMD abstraction component has been in development for several years as part of the NT2 project, and work is under way to turn it into a Boost library.

The project involves consolidating support for non-Intel processors, in particular the AltiVec instruction set, and polishing all aspects of the library (better docs, examples, tests, benchmarks and general boostification improvements).
Benchmarks are especially important given the nature of the library, and are needed to validate the work that has been done.

A talk is planned for BoostCon 2011 (May 15-20) to demonstrate the library, which by that point will already have been somewhat boostified by the NT2 developers.

SSH access to PowerPC G5 and Cell computers will be provided for carrying out the work.

Requirements for the students are as follows:

  • Solid knowledge of modern C++; comfortable with both low-level code that deals with memory and template code.
    Experience with Boost.Proto is a plus, but not strictly required.
  • Basic understanding of SIMD. AltiVec is very orthogonal, and therefore very simple in comparison to SSE, so it can be picked up quickly.

Mentors: Joel Falcou and Mathias Gaunard
