Link to general R GSoC 2013 page
Summary: hyperSpec is a package for working with spectra (like NMR, Infrared, Raman, Fluorescence, UV/VIS, X-ray diffraction).
Links: http://hyperSpec.r-forge.r-project.org and hyperspec
Description: Areas of future development are:
parallelization & speed-up
file and data base import filters
plotting
specialized functions for working with spectra (stand-alone tasks)
interface functions for conversion between hyperSpec and
Skills required: R programming, possibly C / C++
Basic knowledge of spectroscopy or related chemistry/physics is helpful.
Test: Check the code out from the svn repository at r-forge and contribute a function (including documentation & example) for one of the “stand-alone” tasks below.
hyperSpec has a bunch of vignettes that introduce you to the ideas & use of this package.
Feel free to contact me for further information / literature (also for a manuscript draft describing the package).
Mentor: Claudia Beleites
Co-Mentors: Simon Fuller
— Claudia Beleites 2013/04/07
many calculations can easily be parallelized with parallel versions of apply, sweep, aggregate & Co.
modify apply so that rowSum & Co. can easily be used
some pre-processing methods should be suitable for GPU processing at 32-bit precision
identify bottleneck functions and replace them with C/C++ code
maybe data.table would be a fast alternative to the data.frames
Optimized BLAS libraries are easy to use in R, and the parallel package is now part of core R. Key to successful parallelization of hyperSpec’s functions is a good strategy to decide
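A minimal sketch of the first two ideas in this list, using only base R and the parallel package on a plain matrix rather than a hyperSpec object. The toy matrix `spc` and the choice of two workers are assumptions made for illustration. Whether the parallel version pays off in practice depends on the data size and on the cost of shipping the chunks to the workers.

```r
library(parallel)

# Toy spectra matrix (assumption): 1000 spectra in rows, 500 wavelengths in columns.
set.seed(1)
spc <- matrix(rnorm(1000 * 500), nrow = 1000)

# Generic apply versus the specialized rowSums: same result,
# but rowSums avoids one R function call per spectrum.
area.apply <- apply(spc, 1, sum)
area.fast  <- rowSums(spc)

# Parallel version: split the rows into contiguous chunks,
# send one sub-matrix to each worker, and concatenate the results.
n.workers <- 2
idx    <- splitIndices(nrow(spc), n.workers)
chunks <- lapply(idx, function(i) spc[i, , drop = FALSE])

cl <- makeCluster(n.workers)
area.par <- unlist(parLapply(cl, chunks, rowSums))
stopCluster(cl)

# All three approaches should agree.
print(all.equal(area.apply, area.fast))
print(all.equal(area.fast, area.par))
```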
much nicer results
user can customize the plot object / add other things afterwards.
users are scared of writing lattice panel functions
grouping / faceting is needed
base plots allow adding, but axis labels etc. must be set immediately & grouping is close to impossible
estimate: plotspc probably spends more than half of its code on passing through the many possible settings.
plotspc needs cutting wavelength axes & stacked plotting
proof-of-concept functions exist already
hyperSpec is meant to provide an infrastructure for working with spectra.
Thus, hyperSpec has a very general set-up. For now, only very basic locator()-like interaction functions are available.
Acinonyx is very promising, but for the moment it is neither stable nor on CRAN. Therefore, GUI development has low priority at the moment.
However, most GUI tasks have a user interface part and a “computation” part that should be handled separately (so that the calculation can be done e.g. in batch mode on a server without a graphical interface). Calculation projects can be done now, and they are listed below.
most of these tasks are related to some stand-alone topics (see below)
interactive spike filtering for Raman spectra
align different spatially resolved measurements
microscope image & spectral measurements
this is important for the presentation of results of a data analysis
as well as for labelling spectra of heterogeneous samples, e.g. for classification:
stained section with reference diagnosis (microscope image) -> class memberships of spatially resolved measurement
Measurement
GUI: define grid with arbitrary outline
code partially available (Matlab)
outline:
grid options: square, hexagonal
measurement order: along x/y, comb / snake, random
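The measurement-order options can be kept separate from the GUI as a pure calculation. The following is a rough sketch for a square grid only. The function `measurement.order` is hypothetical and not part of hyperSpec, and reading "comb" as "every line is scanned in the same direction" is an assumption.

```r
# Hypothetical helper (not part of hyperSpec): order in which the points
# of an nx x ny square grid are visited, with x as the fast axis.
measurement.order <- function(nx, ny, order = c("comb", "snake", "random")) {
  order <- match.arg(order)

  # expand.grid varies the first factor fastest: line by line along x
  grid <- expand.grid(x = seq_len(nx), y = seq_len(ny))

  if (order == "snake") {
    # reverse the x direction on every second line
    back <- grid$y %% 2 == 0
    grid$x[back] <- nx + 1 - grid$x[back]
  } else if (order == "random") {
    grid <- grid[sample(nrow(grid)), ]
  }

  rownames(grid) <- NULL
  grid
}

snake <- measurement.order(3, 2, "snake")
print(snake)
```

Measuring along y instead of x would amount to swapping the roles of the two columns. A hexagonal grid or an arbitrary outline would need an additional step that generates or filters the grid points before ordering them.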
pre-processing
offset & baseline correction, normalization, centering, etc.
should produce code that can be copied into scripts / Sweave documents
includes a number of the stand-alone tasks
Spectra fitting / library search
GUI
more than one spectra matrix can be in the object (for multi-way data)
other wavelength-related data can be attached to the columns of a spectra matrix
Maybe even matrix decomposition data?
compatibility with other classes
speed up hyperSpec by using data.table instead of data.frames?
Fourier-transform smoothing/interpolation
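One possible starting point for the smoothing part of this task is a simple low-pass filter built on base R's `fft`. This is a sketch only: the function `fft.smooth` and its parameter `keep` are hypothetical names, the input is a plain numeric vector on an evenly spaced wavelength axis, and the hard frequency cut-off is just one of several possible filter shapes.

```r
# Hypothetical helper: smooth one spectrum by keeping only the
# `keep` lowest Fourier frequencies (plus the constant term).
fft.smooth <- function(y, keep = 20) {
  n <- length(y)
  stopifnot(keep >= 1, 2 * keep < n)

  f <- fft(y)

  # Frequency index of each coefficient: 0, 1, 2, ... and the mirrored
  # negative frequencies at the end of the vector.
  j <- 0:(n - 1)
  k <- pmin(j, n - j)

  # Zero all high-frequency coefficients (symmetric, so the result stays real).
  f[k > keep] <- 0

  # R's inverse FFT is unnormalized, hence the division by n.
  Re(fft(f, inverse = TRUE)) / n
}

# Usage on a simulated noisy band
set.seed(42)
x      <- seq(0, 1, length.out = 256)
noisy  <- exp(-(x - 0.5)^2 / 0.005) + rnorm(256, sd = 0.05)
smooth <- fft.smooth(noisy, keep = 15)
```

Because the discrete Fourier transform treats the spectrum as periodic, spectra whose two ends differ strongly can show artefacts at the edges. A real implementation would probably need to handle this, for example by padding or detrending first.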
blending spectra
ternary plotting for mixture diagrams (ggplot2)
plot a dot for each spectrum
should optionally allow points outside the triangle
plot hexbin / density / contour lines for the density
bi/trivariate coloring for plotmap
as if there were e.g. a red and a green fluorescence label
option: additive / subtractive color mixing
a basic implementation using ggplot2 exists
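The colour mixing itself is independent of the plotting system and can be sketched with base R's `rgb`. The function `bivariate.colour` below is hypothetical (it is not the existing ggplot2 implementation mentioned above) and assumes that both intensities have already been scaled to the range [0, 1].

```r
# Hypothetical helper: turn two intensities in [0, 1] into one colour,
# as if `a` were a red and `b` a green label.
bivariate.colour <- function(a, b, mixing = c("additive", "subtractive")) {
  mixing <- match.arg(mixing)
  stopifnot(length(a) == length(b),
            all(a >= 0 & a <= 1), all(b >= 0 & b <= 1))

  if (mixing == "additive") {
    # light on a black background: a drives the red, b the green channel
    rgb(a, b, rep(0, length(a)))
  } else {
    # pigments on a white background: the red label absorbs green and blue,
    # the green label absorbs red and blue
    rgb(1 - b, 1 - a, (1 - a) * (1 - b))
  }
}

print(bivariate.colour(c(1, 0, 1), c(0, 1, 1)))                 # additive
print(bivariate.colour(c(1, 0, 0), c(0, 1, 0), "subtractive"))  # subtractive
```

With additive mixing, pixels where neither component is present come out black; with subtractive mixing they come out white, which may suit printed figures better.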
peak/spike finder
peak/band alignment
peak fitting
signal-to-noise-ratio calculations
single spectrum
repeated measurements
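The two signal-to-noise cases can be sketched on plain vectors and matrices as follows. Both functions are hypothetical, and the definitions used here (band height over the standard deviation of a signal-free region for a single spectrum, and mean over standard deviation per wavelength for repeated measurements) are assumptions. Other definitions are in use, and a contributed function would need to document which one it implements.

```r
# Hypothetical helper, single spectrum: height of the band above the
# mean of a signal-free region, divided by the noise in that region.
# `signal` and `noise` are index vectors into y.
snr.single <- function(y, signal, noise) {
  (max(y[signal]) - mean(y[noise])) / sd(y[noise])
}

# Hypothetical helper, repeated measurements: one repeat per row,
# one wavelength per column; returns one SNR value per wavelength.
snr.repeated <- function(spc) {
  stopifnot(is.matrix(spc), nrow(spc) >= 2)
  colMeans(spc) / apply(spc, 2, sd)
}

# Usage on simulated data: 20 repeats of a noisy band
set.seed(7)
x    <- seq(0, 1, length.out = 100)
band <- 10 * exp(-(x - 0.5)^2 / 0.002)
spc  <- t(replicate(20, band + rnorm(100)))

snr.one <- snr.single(spc[1, ], signal = 40:60, noise = 1:20)
snr.all <- snr.repeated(spc)
```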
matrix/array method for apply that preserves dimensions
further methods, e.g. transform
a file import filter, see above