Multiple Factor Analysis (MFA) - demystify dimensionality reduction techniques (4)
As its name indicates, multiple factor analysis (MFA) is a factorial analysis in which multiple sets of variables describing the same set of individuals are considered. In short, it integrates the different groups of variables by weighting each group, concatenating them into a global data matrix, and performing a factorial analysis (in practice, a PCA) on that matrix.
Here I will briefly describe how MFA works and the various types of visualizations that can be used to display the relationships between individuals and variables.
How MFA works
Suppose there are T data matrices, each corresponding to the observations of a set of variables on the same set of individuals (objects).
Each data matrix is of size I×Jₜ, where I is the number of individuals, and Jₜ is the number of variables in the t-th set. Denote the processed (e.g., standardized) data matrices as Xₜ’s.
The first step is to weight the different sets of variables. We perform a PCA on each Xₜ and divide the matrix by its first singular value (the square root of its first eigenvalue). Denote the normalized matrix as Zₜ. This normalization, or weighting, ensures that the largest eigenvalue of each Zₜ equals 1, so that no single set of variables dominates the global analysis.
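To make this step concrete, here is a minimal numpy sketch (the function and variable names are mine, not from the reference): it divides an already standardized table by its first singular value.

```python
import numpy as np

def weight_table(X_t):
    """Divide an already standardized table X_t by its first singular value,
    so that the largest eigenvalue of the returned Z_t equals 1."""
    gamma_1 = np.linalg.svd(X_t, compute_uv=False)[0]  # first (largest) singular value
    return X_t / gamma_1  # Z_t: its first eigenvalue is (gamma_1 / gamma_1)^2 = 1

# e.g., Z_tables = [weight_table(X_t) for X_t in X_tables]
```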
Next, the normalized data matrices are concatenated column-wise into a large global data matrix

Z = [Z₁ | Z₂ | … | Z_T].

The Z matrix is of size I×J, where J = J₁ + J₂ + … + J_T is the total number of variables across all T sets.
A global PCA is performed on Z.
Denote the mass of each individual as mᵢ, where mᵢ = 1/I if all individuals are weighted equally. Let M = diag(mᵢ) be the diagonal mass matrix of size I×I.
Suppose the (generalized) SVD of Z under the row masses M yields

Z = UΓVᵀ, with UᵀMU and VᵀV both equal to the identity,

where Γ is the diagonal matrix of singular values, whose squares are the eigenvalues of the global analysis. Then the global factor scores can be obtained as

F = UΓ.
Each row of matrix F corresponds to an individual, and the columns are the components.
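Below is a sketch of this global step, assuming equal masses mᵢ = 1/I (all names here are illustrative, not an established API): concatenate the weighted tables, take the SVD under the row masses, and form the factor scores.

```python
import numpy as np

def global_mfa(Z_tables):
    """Global MFA step, assuming equal row masses m_i = 1/I.
    Z_tables: list of I x J_t arrays already weighted by their first singular values."""
    Z = np.hstack(Z_tables)                      # global matrix, I x J
    I = Z.shape[0]
    m = np.full(I, 1.0 / I)                      # masses m_i
    # Generalized SVD under M = diag(m): take a plain SVD of M^(1/2) Z,
    # then rescale the left singular vectors so that U^T M U = identity.
    U0, gamma, Vt = np.linalg.svd(np.sqrt(m)[:, None] * Z, full_matrices=False)
    U = U0 / np.sqrt(m)[:, None]
    F = U * gamma                                # global factor scores F = U Gamma
    return F, Vt.T, gamma**2                     # scores, V, eigenvalues
```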
This factor score matrix can also be thought of as projecting the data onto the global space,

F = ZV,

via the projection matrix V (the right singular vectors of Z). Using this projection matrix, the projection of each data matrix onto the global space can be obtained separately: following Abdi & Valentin (2007), the partial factor scores are Fₜ = T·ZₜVₜ, where Vₜ is the block of rows of V belonging to the t-th variable set, and the factor T makes the partial scores average back to the global F.
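A sketch of these separate projections, continuing the hypothetical helpers above: split V into per-table blocks and rescale by the number of tables T.

```python
import numpy as np

def partial_factor_scores(Z_tables, V):
    """Partial factor scores F_t = T * Z_t V_t, where V_t is the block of rows
    of V belonging to table t. Their average equals the global F = Z V."""
    T = len(Z_tables)
    splits = np.cumsum([Z_t.shape[1] for Z_t in Z_tables])[:-1]
    V_blocks = np.split(V, splits, axis=0)       # per-table blocks of V's rows
    return [T * (Z_t @ V_t) for Z_t, V_t in zip(Z_tables, V_blocks)]
```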
The partial inertia of each dataset for each dimension of the global analysis equals the eigenvalue of that dimension multiplied by the sum of the squared entries of V (the right singular vectors of Z) over the variables belonging to that dataset. Because each column of V has unit norm, the partial inertias of all the datasets for a given dimension sum to the eigenvalue of that dimension.
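In code, again as a sketch reusing the per-table blocks of V from the previous snippet:

```python
import numpy as np

def partial_inertias(V_blocks, eigvals):
    """Partial inertia of table t on each dimension: the eigenvalue times the
    sum of squared entries of that table's block of V. Summing the result over
    tables recovers the eigenvalues, since each column of V has unit norm."""
    return np.array([eigvals * (V_t**2).sum(axis=0) for V_t in V_blocks])
```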
Visualizations of the relationships
I will leave the example graphics for a future note in which I will discuss using Python/R to perform various dimensionality reduction techniques. Here I simply want to list a few commonly used visualizations for presenting MFA results. They help show the relationships between individuals, between variables, between variable sets, and between individuals as described by each variable set.
- Individuals in the global space (e.g., PC1 vs. PC2)
- Correlation circle of variables (how strongly each original variable correlates with the first two PCs)
- Projected partial inertia of each dataset on the first two PCs
- Individuals in the global space with contributions from each dataset
There is a wide variety of visualizations for MFA, some adapting ideas from visualizations for PCA or FA. I will include examples in my future notes.
Reference:
Abdi, H., & Valentin, D. (2007). Multiple factor analysis (MFA). In Encyclopedia of Measurement and Statistics (pp. 657–663).
Here are my other posts in this series if you are interested:
- Linear algebra review & PCA: Demystify dimensionality reduction techniques (1): Principal Component Analysis
- PCoA & MDS: Demystify dimensionality reduction techniques (2): PCoA & Multidimensional Scaling
- FA: Demystify dimensionality reduction techniques (3): Factor Analysis
- Practical dimensionality reduction in R and Python: TBC