ESEL Paper Review_20110126 by Aamir Alaud-din

1. Title and Author

Title: 2011_01_08_The N-way Toolbox for MATLAB

Authors: Claus A. Andersson*, Rasmus Bro

*Department of Dairy and Food Science, Food Technology, Chemometrics Group, The Royal Veterinary and Agricultural University, Rolighedsvej 30, DK-1958 Frederiksberg, Denmark

Department of Dairy and Food Science, Food Technology, Chemometrics Group, The Royal Veterinary and Agricultural University, Rolighedsvej 30, DK-1958 Frederiksberg, Denmark

2. Summary of Paper

The N-way Toolbox is a freely available toolbox for MATLAB®. It is a collection of functions and algorithms for modeling multiway data. It includes several models, such as canonical decomposition-parallel factor analysis (CANDECOMP-PARAFAC), partial least squares regression (PLSR), the generalized rank annihilation method (GRAM), direct trilinear decomposition (DTLD), and Tucker models. Constraints such as nonnegativity, unimodality, and orthogonality are built in and can be imposed when CANDECOMP-PARAFAC is fitted in the least-squares sense.

This paper discusses the following three models:

1. The CANDECOMP-PARAFAC Model

2. The multilinear PLS Regression Algorithm

3. Tucker Models

1. The CANDECOMP-PARAFAC Model

This model was suggested in 1970. For a three-way data array X (I×J×K), the three-way CANDECOMP-PARAFAC model is defined by the following equation:
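In the usual notation for this model, an F-component decomposition can be written as below. The symbols are assumptions, since they are not defined elsewhere in this review: a_{if}, b_{jf}, and c_{kf} are the loadings of component f in the first, second, and third modes, and e_{ijk} is the residual.

```latex
x_{ijk} = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} + e_{ijk},
\qquad i = 1,\dots,I,\quad j = 1,\dots,J,\quad k = 1,\dots,K
```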

The CANDECOMP-PARAFAC model can be fitted in a least-squares sense, optionally under nonnegativity, unimodality, and orthogonality constraints on the components.
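To make the idea of a constrained least-squares fit concrete, the following is a minimal sketch in plain Python of alternating least squares for the one-component case only. It is not the toolbox's algorithm or API: the function names fit_rank_one and sse, the nested-list data layout, and the nonneg flag are all assumptions made for illustration. With a single component, each loading is a one-variable least-squares problem, so the nonnegativity-constrained optimum is simply the unconstrained value clipped at zero. That shortcut does not carry over to several components.

```python
import random


def fit_rank_one(X, n_iter=20, nonneg=False, seed=0):
    """Fit x[i][j][k] ~ a[i] * b[j] * c[k] by alternating least squares.

    Hypothetical helper for illustration; not a function of the N-way Toolbox.
    """
    I, J, K = len(X), len(X[0]), len(X[0][0])
    rng = random.Random(seed)
    # Strictly positive random start for all three loading vectors
    a = [rng.random() + 0.1 for _ in range(I)]
    b = [rng.random() + 0.1 for _ in range(J)]
    c = [rng.random() + 0.1 for _ in range(K)]

    def solve(num, den):
        # Scalar least-squares solution; with nonneg=True the constrained
        # optimum of a one-variable problem is the value clipped at zero.
        v = num / den if den > 0 else 0.0
        return max(0.0, v) if nonneg else v

    for _ in range(n_iter):
        # Update each mode in turn while the other two are held fixed
        den = sum(v * v for v in b) * sum(v * v for v in c)
        a = [solve(sum(X[i][j][k] * b[j] * c[k]
                       for j in range(J) for k in range(K)), den)
             for i in range(I)]
        den = sum(v * v for v in a) * sum(v * v for v in c)
        b = [solve(sum(X[i][j][k] * a[i] * c[k]
                       for i in range(I) for k in range(K)), den)
             for j in range(J)]
        den = sum(v * v for v in a) * sum(v * v for v in b)
        c = [solve(sum(X[i][j][k] * a[i] * b[j]
                       for i in range(I) for j in range(J)), den)
             for k in range(K)]
    return a, b, c


def sse(X, a, b, c):
    """Sum of squared residuals between X and the one-component model."""
    return sum((X[i][j][k] - a[i] * b[j] * c[k]) ** 2
               for i in range(len(a))
               for j in range(len(b))
               for k in range(len(c)))
```

Calling fit_rank_one(X, nonneg=True) on an array whose first-mode profile contains a negative entry forces that loading to zero, at the cost of a larger residual than the unconstrained fit.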

GRAM and DTLD do not give least-squares solutions, but they assume the same model structure as CANDECOMP-PARAFAC. They are noniterative and are therefore faster than PARAFAC.

2. The multilinear PLS Regression Algorithm

The trilinear and multilinear PLS algorithms are straightforward extensions of the ordinary PLS algorithm. When the multiway data are instead rearranged (unfolded) into matrices before PLS regression, no knowledge of the multiway structure is used in the decomposition. This leads to less transparent and less predictive models when a multilinear structure would be a good approximation of the data.
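The rearrangement into matrices can be sketched as follows. This is an illustration only: the helper name unfold_mode1, the nested-list layout, and the column ordering (second-mode index running fastest) are assumptions, not the toolbox's conventions. After unfolding, each row is an ordinary vector of J*K variables, and nothing in the matrix records which columns share the same j or the same k.

```python
def unfold_mode1(X):
    """Rearrange an I x J x K nested list into an I x (J*K) matrix.

    Hypothetical helper for illustration. Element x[i][j][k] ends up in
    row i, column j + J*k, so the K frontal slices sit side by side.
    """
    J, K = len(X[0]), len(X[0][0])
    return [[row[j][k] for k in range(K) for j in range(J)] for row in X]


# Small 2 x 2 x 3 array whose entries encode their own indices as 100*i + 10*j + k
X = [[[100 * i + 10 * j + k for k in range(3)] for j in range(2)]
     for i in range(2)]
M = unfold_mode1(X)  # 2 rows, 6 columns
```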

3. Tucker Models

The Tucker model was proposed in 1963. It is a direct extension of ordinary two-way principal component analysis (PCA). With P, Q, and R components in the first, second, and third modes, the Tucker model can be described as:
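In the usual notation for the three-way Tucker model, the decomposition can be written as below. The symbols are assumptions, since they are not defined elsewhere in this review: a_{ip}, b_{jq}, and c_{kr} are elements of the component matrices for the three modes, g_{pqr} is an element of the P×Q×R core array, and e_{ijk} is the residual.

```latex
x_{ijk} = \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R}
          a_{ip}\, b_{jq}\, c_{kr}\, g_{pqr} + e_{ijk}
```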

In contrast to the CANDECOMP-PARAFAC model, all Tucker models suffer from rotational ambiguity: the component matrices can be rotated and the core array counter-rotated without changing the fit.
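The ambiguity can be seen in one line when the model is written in matrix form. The notation here is an assumption introduced for illustration: X is the array unfolded to I×JK, A (I×P), B (J×Q), and C (K×R) are the component matrices, G is the core array unfolded to P×QR, and S is any nonsingular P×P matrix.

```latex
\mathbf{X} = \mathbf{A}\,\mathbf{G}\,(\mathbf{C} \otimes \mathbf{B})^{\mathsf{T}} + \mathbf{E}
           = (\mathbf{A}\mathbf{S})\,(\mathbf{S}^{-1}\mathbf{G})\,(\mathbf{C} \otimes \mathbf{B})^{\mathsf{T}} + \mathbf{E}
```

The pair (AS, S^{-1}G) therefore gives exactly the same residuals as (A, G), and the same argument applies to the second and third modes.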

Example data sets accompany the toolbox and cover different kinds of multiway data. For example, the fluorescence process data follow a trilinear structure closely, whereas the sensory data are far from trilinear and also contain much noise.

4. Contribution to ESEL

This paper presents three different models for multiway data and discusses the differences between them. The other two models, especially the Tucker model, may help make PARAFAC modeling more robust and powerful.

By: Aamir Alaud-din

aamiralauddin@gist.ac.kr