| 160 | |
| 161 | |
| 162 | === 4. Boost.uBLAS: a library for matrix computations === |
| 163 | Potential mentors: David Bellot |
| 164 | |
| 165 | ==== Background ==== |
| 166 | uBLAS is a library for linear algebra and matrix computations. Using recursive templates, it allows the compiler to optimize complex linear algebra expressions as if they had been written by hand by the programmer. The basic classes are matrix and vector. The library has all the basic functionality and a few standard algorithms. However, it lacks more advanced algorithms such as solvers and decomposition algorithms: an old LU decomposition is all we have. We would like to extend the library with more numerical solvers, especially for the large problems found in Big Data, with matrix decomposition algorithms such as QR and Cholesky, and with eigensolvers to compute Schur or Jordan decompositions. All these algorithms are very important in Machine Learning nowadays and form the basis of many learning algorithms. |
| 167 | Another topic of interest for Big Data and Machine Learning is doing statistics on matrices. Languages like R or Python (with Pandas) use the notion of a data frame and offer many aggregation and grouping algorithms to generate all sorts of statistics on huge matrices. As this has become a very important topic, we would like to have similar functions in uBLAS. For examples, see libraries like Pandas (http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.html) or the very powerful R package named data.table (http://cran.r-project.org/web/packages/data.table/index.html). Having similar functionality in uBLAS would be a must! |
| 168 | |
| 169 | ==== GSoC project proposal ==== |
| 170 | |
| 171 | 1. For the solvers project: |
| 172 | - study the main solving and decomposition algorithms, |
| 173 | - study how to integrate an algorithm in uBLAS |
| 174 | - implement the algorithm |
| 175 | - write extensive tests that check not only the correctness of the algorithms but also their speed and numerical stability. The last point is arguably one of the most important. |
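One possible way to quantify the numerical stability mentioned in the last point is the normwise relative residual of a computed solution. The sketch below is only an illustration: it uses `std::vector` as a stand-in for the uBLAS `matrix` and `vector` classes, and the names `vec_norm_inf`, `mat_norm_inf` and `relative_residual` are hypothetical helpers, not part of uBLAS.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumption: plain std::vector containers stand in for uBLAS types.
using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Infinity norm of a vector: largest absolute entry.
double vec_norm_inf(const Vector& v) {
    double m = 0.0;
    for (double x : v) m = std::max(m, std::fabs(x));
    return m;
}

// Infinity norm of a matrix: largest absolute row sum.
double mat_norm_inf(const Matrix& a) {
    double m = 0.0;
    for (const Vector& row : a) {
        double s = 0.0;
        for (double x : row) s += std::fabs(x);
        m = std::max(m, s);
    }
    return m;
}

// Normwise relative residual of a computed solution x of A x = b:
//   ||b - A x|| / (||A|| * ||x|| + ||b||)
// A value close to machine epsilon suggests a backward-stable solve.
double relative_residual(const Matrix& a, const Vector& x, const Vector& b) {
    assert(a.size() == b.size());
    Vector r(b);  // r starts as b, then A x is subtracted row by row
    for (std::size_t i = 0; i < a.size(); ++i) {
        assert(a[i].size() == x.size());
        for (std::size_t j = 0; j < x.size(); ++j) r[i] -= a[i][j] * x[j];
    }
    const double denom = mat_norm_inf(a) * vec_norm_inf(x) + vec_norm_inf(b);
    return denom == 0.0 ? 0.0 : vec_norm_inf(r) / denom;
}
```

A test suite could run a solver on increasingly ill-conditioned inputs (Hilbert matrices are a common choice) and track how this value grows with the problem size.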
| 176 | |
| 177 | 2. For the data frame project: |
| 178 | - well... simply replicating R's data.frame would already be great! |
| 179 | - ideally, implement the aggregate and summary functions for data.frame and matrix, as in R |
| 180 | - add row and column statistics, as in Matlab |
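As a rough idea of what the column statistics in the last item could look like, here is a minimal sketch that computes the mean and sample variance of every column in a single pass, using Welford's update. It assumes a `std::vector` of rows in place of a uBLAS matrix; `ColumnStats` and `column_stats` are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: a vector of rows stands in for a uBLAS matrix.
using Matrix = std::vector<std::vector<double>>;

struct ColumnStats {
    std::vector<double> mean;
    std::vector<double> variance;  // unbiased sample variance (divides by n - 1)
};

// One pass over the rows, updating a running mean and a running sum of
// squared deviations per column (Welford's algorithm).
ColumnStats column_stats(const Matrix& m) {
    const std::size_t rows = m.size();
    const std::size_t cols = rows == 0 ? 0 : m[0].size();
    ColumnStats s{std::vector<double>(cols, 0.0), std::vector<double>(cols, 0.0)};
    for (std::size_t i = 0; i < rows; ++i) {
        assert(m[i].size() == cols);
        for (std::size_t j = 0; j < cols; ++j) {
            const double delta = m[i][j] - s.mean[j];
            s.mean[j] += delta / static_cast<double>(i + 1);
            s.variance[j] += delta * (m[i][j] - s.mean[j]);  // sum of squares so far
        }
    }
    // Turn the accumulated sums of squares into sample variances.
    for (std::size_t j = 0; j < cols; ++j)
        s.variance[j] = rows > 1 ? s.variance[j] / static_cast<double>(rows - 1) : 0.0;
    return s;
}
```

The one-pass form is used here because it avoids the cancellation that the textbook "mean of squares minus square of mean" formula can suffer from; row statistics would follow the same pattern with the loops swapped.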
| 181 | |
| 182 | ==== Programming competency test ==== |
| 183 | Implement Gaussian elimination, as described in http://en.wikipedia.org/wiki/Gaussian_elimination, in uBLAS. Write code to test the numerical stability of the algorithm. |
| 184 | The competency test will be considered valid if: |
| 185 | 1- the algorithm is correctly implemented (of course) |
| 186 | 2- it is integrated into uBLAS |
| 187 | 3- it comes with test code |
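For orientation only, the algorithm itself can be sketched as follows. This version adds partial pivoting, a common choice for numerical stability, and works on `std::vector` containers rather than uBLAS types, so it does not satisfy criterion 2. The name `gauss_solve` is hypothetical; a real submission would operate on uBLAS `matrix` and `vector` objects.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Assumption: plain std::vector containers stand in for uBLAS types.
using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Solves A x = b by Gaussian elimination with partial pivoting.
// A and b are taken by value because elimination overwrites them.
Vector gauss_solve(Matrix a, Vector b) {
    const std::size_t n = a.size();
    assert(b.size() == n);
    for (const Vector& row : a) assert(row.size() == n);

    for (std::size_t k = 0; k < n; ++k) {
        // Partial pivoting: pick the row with the largest entry in column k.
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(a[i][k]) > std::fabs(a[p][k])) p = i;
        if (a[p][k] == 0.0) throw std::runtime_error("matrix is singular");
        std::swap(a[k], a[p]);
        std::swap(b[k], b[p]);

        // Eliminate column k from all rows below the pivot row.
        for (std::size_t i = k + 1; i < n; ++i) {
            const double f = a[i][k] / a[k][k];
            for (std::size_t j = k; j < n; ++j) a[i][j] -= f * a[k][j];
            b[i] -= f * b[k];
        }
    }

    // Back substitution on the resulting upper triangular system.
    Vector x(n);
    for (std::size_t i = n; i-- > 0;) {
        double s = b[i];
        for (std::size_t j = i + 1; j < n; ++j) s -= a[i][j] * x[j];
        x[i] = s / a[i][i];
    }
    return x;
}
```

The exact zero test on the pivot only catches structurally singular inputs; a stability test would also want to exercise nearly singular matrices and compare the computed solution against a known one.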
| 188 | |
| 189 | Submit the programming test by copying and pasting your code at the end of the proposal you submit to Google Melange. |