| 200 | === 2. Boost.uBLAS: linear algebra and matrix computations === |
| 201 | Potential mentors: David Bellot |
| 202 | |
| 203 | All projects with Boost.uBLAS require knowledge of C++11. |
| 204 | |
| 205 | ==== Background ==== |
| 206 | uBLAS is a library for linear algebra and matrix computations. Using recursive templates (the expression template technique), it lets the compiler optimize complex linear algebra expressions as if they had been written by hand by the programmer. The basic classes are matrix and vector. The library provides the basic functionality and a few standard algorithms. We would like to extend it with new algorithms and features, especially in the fields of data analysis and machine learning. |
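
As a rough illustration of the expression template idea mentioned above, here is a minimal, self-contained sketch. It is not uBLAS code: the names `vec_expr`, `vec` and `vec_sum` are hypothetical and chosen only for this example. The point is that `a + b + c` builds a lightweight expression object, and the actual additions happen in a single loop when the result is assigned, without temporary vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base class: tags a type as a vector expression (hypothetical name).
template <class E>
struct vec_expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Concrete vector that owns its storage.
class vec : public vec_expr<vec> {
public:
    explicit vec(std::size_t n, double value = 0.0) : data_(n, value) {}

    // Building a vec from any expression evaluates it in one loop.
    template <class E>
    vec(const vec_expr<E>& e) : data_(e.self().size()) {
        for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = e.self()[i];
    }

    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<double> data_;
};

// Expression node for lhs + rhs; nothing is computed until operator[] is called.
// It stores references, so it must not outlive the full expression it is part of.
template <class L, class R>
class vec_sum : public vec_expr<vec_sum<L, R>> {
public:
    vec_sum(const L& l, const R& r) : l_(l), r_(r) {
        assert(l.size() == r.size());
    }
    double operator[](std::size_t i) const { return l_[i] + r_[i]; }
    std::size_t size() const { return l_.size(); }

private:
    const L& l_;
    const R& r_;
};

// operator+ only builds the expression tree; it does no arithmetic itself.
template <class L, class R>
vec_sum<L, R> operator+(const vec_expr<L>& l, const vec_expr<R>& r) {
    return vec_sum<L, R>(l.self(), r.self());
}
```

With these definitions, `vec r = a + b + c;` has the type `vec_sum<vec_sum<vec, vec>, vec>` on the right-hand side, and the constructor of `vec` evaluates it element by element. Storing the expression with `auto` instead would leave dangling references in this simplified design.
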
| 207 | |
| 208 | ==== PROJECT 1 : Add Multicore and GPU computations to uBLAS ==== |
| 209 | |
| 210 | The project description is simple: add support for multicore parallel and GPU computations to uBLAS! The implementation is not straightforward, though. |
| 211 | Boost supports parallel/GPU computations thanks to the Boost.Compute library (http://www.boost.org/doc/libs/1_63_0/libs/compute/doc/html/index.html). Boost.uBLAS is CPU only. If the compiler is able to vectorize, uBLAS can benefit from it. Here we want to extend Boost.uBLAS to support parallel architectures and GPU computations, so that it can be used for big data or deep learning computations. |
| 212 | |
| 213 | The student will first have to understand how uBLAS works and how it generates and optimizes code with the expression template mechanism, and then start adding options to enable the use of Boost.Compute. Tests will be done on multicore systems and on graphics cards or computers that support Boost.Compute (through OpenCL, for example). |
| 214 | |
| 215 | We expect the basic matrix operations to be implemented this way. The code will have to be thoroughly documented and a tutorial document provided. We prefer quality of implementation to exhaustiveness. |
| 216 | |
| 217 | For exceptionally good and fast students, extensions to support other libraries, such as NVIDIA CUDA, will be considered. |
| 218 | |
| 219 | In other words, it will have to be clean and super fast! |
| 220 | |
| 221 | ==== PROJECT 2 : Data.Frames in Boost.uBLAS ==== |
| 222 | |
| 223 | Languages like R or Python (with Pandas) use the notion of a data frame and have many aggregation and grouping algorithms to generate all sorts of statistics on huge matrices. As this has become a very important topic, we would like to have similar functions in uBLAS. For example, see libraries like Pandas ( http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.html) or the very powerful R package named data.table ( http://cran.r-project.org/web/packages/data.table/index.html). Having similar functionality in uBLAS would be a must! |
| 224 | |
| 225 | The project will require the student to understand the basics of the R data.frame and to see what kind of limitations arise when it has to be implemented with a template meta-program in C++. The student will also have to identify the optimizations that are not possible with the general-purpose data frames of R and Python because of missing information (such as column types). |
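
To make the remark about column types more concrete, here is a small sketch of a data frame whose column types are known at compile time. It is only an illustration under stated assumptions: `data_frame`, `push_back`, `column`, `rows` and `mean` are hypothetical names invented for this example and are not part of uBLAS or of any proposed design.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical data frame: one std::vector per column, column types fixed
// as template parameters so the compiler knows them statically.
template <class... Ts>
class data_frame {
public:
    // Append one row; each value is stored in its own column.
    void push_back(const Ts&... values) {
        push_impl(std::index_sequence_for<Ts...>{}, values...);
    }

    // Access column I with its exact element type, resolved at compile time.
    template <std::size_t I>
    const auto& column() const { return std::get<I>(columns_); }

    std::size_t rows() const { return std::get<0>(columns_).size(); }

private:
    template <std::size_t... Is>
    void push_impl(std::index_sequence<Is...>, const Ts&... values) {
        (std::get<Is>(columns_).push_back(values), ...);  // C++17 fold expression
    }

    std::tuple<std::vector<Ts>...> columns_;
};

// Mean of numeric column I; requesting a non-numeric column fails to compile.
template <std::size_t I, class... Ts>
double mean(const data_frame<Ts...>& df) {
    const auto& col = df.template column<I>();
    assert(!col.empty());
    return std::accumulate(col.begin(), col.end(), 0.0) / col.size();
}
```

For example, with `data_frame<std::string, int, double> df;`, a call such as `mean<1>(df)` works directly on a `std::vector<int>` with no run-time type checks, while `mean<0>(df)` on the string column is rejected by the compiler.
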
| 226 | |
| 227 | Finally, the student is expected to implement algorithms on the data.frame that can potentially be reused on matrices too, such as subset selection with generic operators, statistics, and summaries. Understanding memory management, alignment, optimizations, and vector processing is not mandatory but most welcome. |
| 228 | |
| 229 | Understanding C++ expression templates and C++ metaprogramming is required. |
| 230 | |
| 231 | The student will start by studying existing implementations and proposing a design. Then he or she will implement a prototype with tests and benchmarks. The final stage will be a thorough integration into uBLAS, and especially writing examples and documentation. |
| 232 | |
| 233 | ==== Potential project extension funded by Boost ==== |
| 234 | Not as much detail as the GSoC proposal itself, but enough detail that such an extension could be specced out. |
| 235 | |
| 236 | If there is no good extension, simply say N/A here. |
| 237 | |
| 238 | ==== Programming competency test ==== |
| 239 | |
| 240 | Implement a small library in C++, using expression templates and C++17 features like generic lambdas, that: |
| 241 | 1. represents a matrix of numerical types (int, long, float, double, complex, ...) |
| 242 | 2. computes simple algebraic expressions including + and * |
| 243 | 3. fits in one header file |
| 244 | |