This wiki page will be the central hub of information about the R Project's participation in the Google Summer of Code (GSoC) for 2012. The administrators are Toby Dylan Hocking and John C. Nash, with Brian Peterson and Virgilio Gomez Rubio as backups. Questions should be posted to the Google group.
Everyone who wants to participate in this year's Google Summer of Code with R should join our Google group: gsoc-r@googlegroups.com
In short, each student will be paid $5000 to work on an R package for 3 months during the summer.
Previous projects are listed on the soc08, soc09, soc10, and soc11 pages.
Submission of unsolicited proposals (via, say, adding to the list below) is certainly welcome. However, each proposal has a big barrier to pass: it will need to explain why this particular project is of use to the community, and it will have to show how it can be achieved in three months. At a minimum, a submission should also include a review of related packages that address the same or similar problems and discuss why these packages are not sufficient for the task.
| Mentors | Proposal | Status | Students | Results |
|---|---|---|---|---|
| Project Template | Project Suggestion Template | Use this to add new proposals by copying the page content | ||
| Y. Richet, backup John Nash | Handle parallel (vectorized) objective functions in a new optimization wrapper package | inquiries from 2 candidates, not accepted | ||
| C. Boettiger | Wikipedia Articles | not accepted | Christoph Molnar | |
| Toby Dylan Hocking, Yihui Xie | Interactive animations for exploring high-dimensional and time series data | 2 applications not accepted | Deeptanshu Jha, Eric Momsen | |
| Eric Chu, H. W. Borchers | Implement an optimization modeling tool or solver for Disciplined Convex Programming | will be pursued outside of gsoc | ||
| Toby Dylan Hocking, Paul Murrell | Label plots using images rather than factor names | not accepted | Christoph Molnar | |
| V. Gómez-Rubio | Translate Matlab code for Bayesian methods in the Spatial Econometrics Toolbox into R | accepted | Abhirup Mallik | EDITME |
| David Carino | Portfolio Performance Measurement and Benchmarking | accepted | Andrii Babii | EDITME |
| Rafa ? | BigMatrix: Super Scalable Predictive Analytics for Big Matrices | accepted | Fang ? | EDITME |
| Guy Yollin, Brian Peterson | Add additional closed form and global optimizers to portfolio optimization framework | accepted | Hezky Varon | EDITME |
| Kris Boudt | Extend RTAQ for additional high frequency time series analysis | accepted | Jonathan Cornelissen | https://r-forge.r-project.org/projects/highfrequency/ for the code and https://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/highfrequency/inst/doc/highfrequency.pdf?root=highfrequency for the vignette – The highfrequency package contains an extensive toolkit for working with high-frequency financial data in R. It offers functionality to manage, clean, and match high-frequency trades and quotes data. Furthermore, it enables users to easily calculate various liquidity measures, estimate and forecast volatility, and investigate microstructure noise and intraday periodicity – In the future, some work could still be done on the implementation of forecasting models based on high-frequency data, testing for jumps using realized volatility estimates, and many other interesting applications of high-frequency data analysis. |
| H. W. Borchers | Develop an R package interfacing the computer algebra system Maxima | accepted | Kseniia Shumelchyk | RMaxima Project Page – RMaxima enables the user to communicate with the CAS Maxima (at the moment Linux only). A Windows version, easier installation, and ways to avoid Maxima syntax will be needed to make it a popular R package. |
| Brian Peterson | Convert Meucci's Matlab Code to R | accepted | Manan Shah | EDITME |
| Peter Carl | Additional performance measures and attribution functionality | accepted | Matthieu ? | EDITME |
| Joshua Ulrich, Stuart Greenlee | Improvements to xts time series visualization and subsetting | accepted | Michael Weylandt | Code is available in the xtsExtra package on R-Forge. It provides a vastly expanded plot.xts method and xts wrappers for commonly used analytic functions (e.g. arima, acf, HoltWinters). There is a prototype for the multi-type xts object portion of the project; finishing that implementation could be a project for another student. |
| Claudia Beleites | Handling of spectroscopic data sets in R | accepted | Simon Fuller | Code on GitHub and R-Forge; blog; import functions for binary formats written; talk to Claudia for ideas for next year |
| Yihui Xie | Dynamic report generation in the web with R | accepted | Taiyun Wei | Code developed; at the moment, not enough work is left for the next GSoC |
| Hadley Wickham, John Nash, Di Cook | Aggregate CRAN package download statistics across multiple mirrors | accepted | Timothy Jurka | Live site is available, but no CRAN mirrors are supplying data at the moment. However, some sample data has been uploaded, and can be viewed in “Year to Date” view. |
| cbhurley | Interactive dendrograms | accepted | Tomáš Sieger | EDITME |
| Roeder ? | SAM: A General-purpose classifier for modern predictive data analysis | accepted | Tour ? | EDITME |
| Carl Boettiger, Karthik Ram, Scott Chamberlain | Dynamic access and visualization of scientific data repositories | accepted | Vijay Barve | Code for rgbif, an R interface to the Global Biodiversity Information Facility (GBIF), is hosted on GitHub; the student also wrote blog posts. There could be more work on rgbif and related packages next year |
| Han Liu | Biganalysis: a robust, general-purpose R package for large-scale classification | accepted | Xiaolin Yang | EDITME |