Title: | Robust Angle Based Joint and Individual Variation Explained |
---|---|
Description: | A robust alternative to the aJIVE (angle based Joint and Individual Variation Explained) method (Feng et al 2018: <doi:10.1016/j.jmva.2018.03.008>) for the estimation of joint and individual components in the presence of outliers in multi-source data. It decomposes the multi-source data into joint, individual and residual (noise) contributions. The decomposition is robust to outliers and noise in the data. The method is illustrated in Ponzi et al (2021) <arXiv:2101.09110>. |
Authors: | Erica Ponzi [aut, cre], Abhik Ghosh [aut] |
Maintainer: | Erica Ponzi <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0 |
Built: | 2024-11-05 03:57:03 UTC |
Source: | https://github.com/ericaponzi/rajive |
Simulates blocks of data with joint and individual structures
ajive.data.sim( K = 3, rankJ = 2, rankA = c(20, 15, 10), n = 100, pks, dist.type = 1, noise = 1 )
K |
Integer. Number of data blocks. |
rankJ |
Integer. Joint rank. |
rankA |
Vector of Integers. Individual Ranks. |
n |
Integer. Number of data points. |
pks |
Vector of Integers. Number of variables in each block. |
dist.type |
Integer. 1 for normal, 2 for uniform, 3 for exponential |
noise |
Integer. Standard deviation used in the distribution. |
Xsim: a list of simulated data matrices and the true rank values.
n <- 20
p1 <- 10
p2 <- 8
p3 <- 5
JrankTrue <- 2
initial_signal_ranks <- c(5, 2, 2)
Y <- ajive.data.sim(K = 3, rankJ = JrankTrue, rankA = initial_signal_ranks, n = n, pks = c(p1, p2, p3), dist.type = 1)
Visualization of the RaJIVE decomposition: shows heatmaps of the decomposition obtained by RaJIVE.
data_heatmap(data, show_color_bar = TRUE, title = "", xlab = "", ylab = "")
data |
List. The initial data blocks. |
show_color_bar |
Boolean. |
title |
Character. |
xlab |
Character. |
ylab |
Character. |
Visualization of the RaJIVE decomposition: shows heatmaps of the decomposition obtained by RaJIVE.
decomposition_heatmaps_robustH(blocks, jive_results_robust)
blocks |
List. The initial data blocks. |
jive_results_robust |
List. The RaJIVE decomposition. |
The heatmap of the decomposition
n <- 10
pks <- c(20, 10)
Y <- ajive.data.sim(K = 2, rankJ = 2, rankA = c(7, 4), n = n, pks = pks, dist.type = 1)
initial_signal_ranks <- c(7, 4)
data.ajive <- list((Y$sim_data[[1]]), (Y$sim_data[[2]]))
ajive.results.robust <- Rajive(data.ajive, initial_signal_ranks)
decomposition_heatmaps_robustH(data.ajive, ajive.results.robust)
Gets the block loadings from the Rajive decomposition
get_block_loadings(ajive_output, k, type)
ajive_output |
List. The decomposition from Rajive |
k |
Integer. The index of the data block |
type |
Character. Joint or individual |
The block loadings
n <- 10
pks <- c(20, 10)
Y <- ajive.data.sim(K = 2, rankJ = 2, rankA = c(7, 4), n = n, pks = pks, dist.type = 1)
initial_signal_ranks <- c(7, 4)
data.ajive <- list((Y$sim_data[[1]]), (Y$sim_data[[2]]))
ajive.results.robust <- Rajive(data.ajive, initial_signal_ranks)
get_block_loadings(ajive.results.robust, 2, 'joint')
Gets the block scores from the Rajive decomposition
get_block_scores(ajive_output, k, type)
ajive_output |
List. The decomposition from Rajive |
k |
Integer. The index of the data block |
type |
Character. Joint or individual |
The block scores
n <- 10
pks <- c(20, 10)
Y <- ajive.data.sim(K = 2, rankJ = 2, rankA = c(7, 4), n = n, pks = pks, dist.type = 1)
initial_signal_ranks <- c(7, 4)
data.ajive <- list((Y$sim_data[[1]]), (Y$sim_data[[2]]))
ajive.results.robust <- Rajive(data.ajive, initial_signal_ranks)
get_block_scores(ajive.results.robust, 2, 'joint')
Computes X = J + I + E for a single data block and the respective SVDs.
get_final_decomposition_robustH(X, joint_scores, sv_threshold, full = TRUE)
X |
Matrix. The original data matrix. |
joint_scores |
Matrix. The basis of the joint space (dimension n x joint_rank). |
sv_threshold |
Numeric vector. The singular value thresholds from the initial signal rank estimates. |
full |
Boolean. Do we compute the full J, I matrices or just the SVD (set to FALSE to save memory). |
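The identity X = J + I + E can be illustrated with a minimal base-R sketch. This is not the package's implementation: it assumes observations in rows, an orthonormal `joint_scores` basis, the ordinary `svd()` in place of the robust SVD, and a made-up threshold value. All variable names are illustrative.

```r
# Sketch only: ordinary svd() stands in for the robust SVD used by the package.
set.seed(42)
n <- 30
p <- 8
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Assumed orthonormal basis of the joint score space (n x joint_rank).
joint_scores <- qr.Q(qr(matrix(rnorm(n * 2), nrow = n, ncol = 2)))

# Joint part: projection of X onto the joint score space.
J <- joint_scores %*% crossprod(joint_scores, X)

# Individual part: low-rank approximation of what remains after removing J,
# keeping only singular values above a hypothetical threshold.
sv_threshold <- 5
s <- svd(X - J)
keep <- which(s$d > sv_threshold)
I_mat <- s$u[, keep, drop = FALSE] %*%
  diag(s$d[keep], nrow = length(keep)) %*%
  t(s$v[, keep, drop = FALSE])

# Residual: whatever neither J nor I_mat explains.
E <- X - J - I_mat
```

In this sketch the individual part is orthogonal to the joint score space by construction, because it is built from the SVD of `X - J`.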
Computes the individual matrix for a data block.
get_individual_decomposition_robustH( X, joint_scores, sv_threshold, full = TRUE )
X |
Matrix. The original data matrix. |
joint_scores |
Matrix. The basis of the joint space (dimension n x joint_rank). |
sv_threshold |
Numeric vector. The singular value thresholds from the initial signal rank estimates. |
full |
Boolean. Do we compute the full J, I matrices or just the SVD (set to FALSE to save memory). |
Gets the individual ranks from the Rajive decomposition
get_individual_rank(ajive_output, k)
ajive_output |
List. The decomposition from Rajive |
k |
Integer. The index of the data block. |
The individual ranks
n <- 10
pks <- c(20, 10)
Y <- ajive.data.sim(K = 2, rankJ = 2, rankA = c(7, 4), n = n, pks = pks, dist.type = 1)
initial_signal_ranks <- c(7, 4)
data.ajive <- list((Y$sim_data[[1]]), (Y$sim_data[[2]]))
ajive.results.robust <- Rajive(data.ajive, initial_signal_ranks)
get_individual_rank(ajive.results.robust, 2)
Computes the joint matrix for a data block.
get_joint_decomposition_robustH(X, joint_scores, full = TRUE)
X |
Matrix. The original data matrix. |
joint_scores |
Matrix. The basis of the joint space (dimension n x joint_rank). |
full |
Boolean. Do we compute the full J, I matrices or just the SVD (set to FALSE to save memory). |
Gets the joint rank from the Rajive decomposition
get_joint_rank(ajive_output)
ajive_output |
List. The decomposition from Rajive |
The joint rank
n <- 10
pks <- c(20, 10)
Y <- ajive.data.sim(K = 2, rankJ = 2, rankA = c(7, 4), n = n, pks = pks, dist.type = 1)
initial_signal_ranks <- c(7, 4)
data.ajive <- list((Y$sim_data[[1]]), (Y$sim_data[[2]]))
ajive.results.robust <- Rajive(data.ajive, initial_signal_ranks)
get_joint_rank(ajive.results.robust)
Estimates the joint rank with the Wedin bound, computes the SVD of the signal scores, and double-checks each joint component.
get_joint_scores_robustH( blocks, block_svd, initial_signal_ranks, sv_thresholds, n_wedin_samples = 1000, n_rand_dir_samples = 1000, joint_rank = NA )
blocks |
List. A list of the data matrices. |
block_svd |
List. The SVD of the data blocks. |
initial_signal_ranks |
Numeric vector. Initial signal rank estimates. |
sv_thresholds |
Numeric vector. The singular value thresholds from the initial signal rank estimates. |
n_wedin_samples |
Integer. Number of Wedin bound samples to draw for each data matrix. |
n_rand_dir_samples |
Integer. Number of random direction bound samples to draw. |
joint_rank |
Integer or NA. User-specified joint rank. If NA, it will be estimated from the data. |
Samples from the random direction bound. Returns the samples on the scale of the squared singular value.
get_random_direction_bound_robustH(n_obs, dims, num_samples = 1000)
n_obs |
The number of observations. |
dims |
The number of features in each data matrix |
num_samples |
Integer. Number of vectors selected for resampling procedure. |
rand_dir_samples
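One plausible reading of this description can be sketched in base R. The sketch assumes that each sample is the largest squared singular value of the column-wise concatenation of one random orthonormal basis per block, and that every entry of `dims` is at most `n_obs`. The function name is hypothetical and the code is not taken from the package source.

```r
# Sketch only; assumes each dims[b] <= n_obs.
random_direction_bound_sketch <- function(n_obs, dims, num_samples = 1000) {
  samples <- numeric(num_samples)
  for (s in seq_len(num_samples)) {
    # One random orthonormal basis per block, taken from the left singular
    # vectors of a Gaussian noise matrix.
    bases <- lapply(dims, function(d) {
      svd(matrix(rnorm(n_obs * d), nrow = n_obs, ncol = d))$u
    })
    M <- do.call(cbind, bases)
    # Largest squared singular value of the concatenated bases.
    samples[s] <- svd(M)$d[1]^2
  }
  samples
}

set.seed(1)
rand_dir_samples <- random_direction_bound_sketch(
  n_obs = 20, dims = c(3, 2), num_samples = 50
)
```

Under these assumptions every sample lies between 1 and the number of blocks, since the concatenated matrix is built from orthonormal bases.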
Computes the singular value threshold for the data matrix (halfway between the rank-th and (rank + 1)-th singular values).
get_sv_threshold(singular_values, rank)
singular_values |
Numeric. The singular values. |
rank |
Integer. The rank of the approximation. |
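The halfway rule described above can be written in a couple of lines. The helper below is a hypothetical illustration with made-up singular values, not the package source.

```r
# Sketch only: midpoint between the rank-th and (rank + 1)-th singular values.
sv_threshold_sketch <- function(singular_values, rank) {
  0.5 * (singular_values[rank] + singular_values[rank + 1])
}

sv <- c(10, 6, 2, 1)               # illustrative singular values, sorted decreasingly
thr <- sv_threshold_sketch(sv, 2)  # midpoint between 6 and 2
```

Singular values above the threshold are treated as signal and those below as noise, so the threshold sits in the gap that the chosen rank implies.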
Computes the robust SVD of a matrix using robRsvd.
get_svd_robustH(X, rank = NULL)
X |
Matrix. X matrix. |
rank |
Integer. Rank of SVD decomposition |
List. The SVD of X.
Gets the Wedin bound samples.
get_wedin_bound_samples(X, SVD, signal_rank, num_samples = 1000)
X |
Matrix. The data matrix. |
SVD |
List. The SVD of the matrix: a list with entries 'u', 'd', and 'v' from the svd function. |
signal_rank |
Integer. |
num_samples |
Integer. Number of vectors selected for resampling procedure. |
Computes the robust aJIVE decomposition with parallel computation.
Rajive( blocks, initial_signal_ranks, full = TRUE, n_wedin_samples = 1000, n_rand_dir_samples = 1000, joint_rank = NA )
blocks |
List. A list of the data matrices. |
initial_signal_ranks |
Vector. The initial signal rank estimates. |
full |
Boolean. Whether or not to store the full J, I, E matrices or just their SVDs (set to FALSE to save memory). |
n_wedin_samples |
Integer. Number of Wedin bound samples to draw for each data matrix. |
n_rand_dir_samples |
Integer. Number of random direction bound samples to draw. |
joint_rank |
Integer or NA. User-specified joint rank. If NA, it will be estimated from the data. |
The aJIVE decomposition.
n <- 50
pks <- c(100, 80, 50)
Y <- ajive.data.sim(K = 3, rankJ = 3, rankA = c(7, 6, 4), n = n, pks = pks, dist.type = 1)
initial_signal_ranks <- c(7, 6, 4)
data.ajive <- list((Y$sim_data[[1]]), (Y$sim_data[[2]]), (Y$sim_data[[3]]))
ajive.results.robust <- Rajive(data.ajive, initial_signal_ranks)
Computes the robust SVD of a matrix
RobRSVD.all(data, nrank = min(dim(data)), svdinit = svd(data))
data |
Matrix. X matrix. |
nrank |
Integer. Rank of SVD decomposition |
svdinit |
List. The standard SVD. |
List. The SVD of X.
Gets the variance explained by each component of the Rajive decomposition
showVarExplained_robust(ajiveResults, blocks)
ajiveResults |
List. The decomposition from Rajive |
blocks |
List. The initial data blocks |
The proportion of variance explained by each component
n <- 10
pks <- c(20, 10)
Y <- ajive.data.sim(K = 2, rankJ = 2, rankA = c(7, 4), n = n, pks = pks, dist.type = 1)
initial_signal_ranks <- c(7, 4)
data.ajive <- list((Y$sim_data[[1]]), (Y$sim_data[[2]]))
ajive.results.robust <- Rajive(data.ajive, initial_signal_ranks)
showVarExplained_robust(ajive.results.robust, data.ajive)
Simulates a single data block from a given distribution.
sim_dist(num, n, p)
num |
Integer. Type of distribution. 1 for normal, 2 for uniform, 3 for exponential |
n |
Integer. Number of data points. |
p |
Integer. Number of variables in the block. |
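The mapping from the distribution code to a simulated block might look like the base-R sketch below. The function name is hypothetical, and the distribution parameters (standard normal, uniform on [0, 1], exponential with rate 1) are assumptions, since the documentation does not state them.

```r
# Sketch only: distribution parameters are R defaults, chosen as an assumption.
sim_dist_sketch <- function(num, n, p) {
  values <- switch(
    as.character(num),
    "1" = rnorm(n * p),  # normal
    "2" = runif(n * p),  # uniform
    "3" = rexp(n * p),   # exponential
    stop("num must be 1, 2 or 3")
  )
  # Arrange the draws as an n x p block (data points in rows).
  matrix(values, nrow = n, ncol = p)
}

set.seed(7)
block <- sim_dist_sketch(num = 3, n = 5, p = 4)
```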
Computes UDV^T to get the approximate (or full) X matrix.
svd_reconstruction(decomposition)
decomposition |
List. List with entries 'u', 'd', and 'v' from the svd function. |
Matrix. The original matrix.
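The product UDV^T can be checked directly in base R. The helper below is a hypothetical stand-in that assumes a list shaped like the output of `svd()`. It is not the package source.

```r
# Sketch only: rebuild a matrix from its SVD factors.
svd_reconstruction_sketch <- function(decomposition) {
  d <- decomposition$d
  # diag(d, nrow = length(d)) also handles the rank-one case correctly.
  decomposition$u %*% diag(d, nrow = length(d)) %*% t(decomposition$v)
}

set.seed(1)
X <- matrix(rnorm(12), nrow = 4, ncol = 3)
X_hat <- svd_reconstruction_sketch(svd(X))
```

With the full SVD the reconstruction matches `X` up to floating-point error. With a truncated SVD the same product gives a low-rank approximation.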
Removes columns from the U, D, V matrices computed from an SVD.
truncate_svd(decomposition, rank)
decomposition |
List. List with entries 'u', 'd', and 'v' from the svd function. |
rank |
Integer. The rank to which the SVD is truncated. |
The truncated robust SVD of X.
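Truncation amounts to keeping the leading `rank` singular triplets. The sketch below uses a hypothetical helper and the ordinary `svd()` rather than the robust SVD, so it illustrates only the bookkeeping.

```r
# Sketch only: keep the first `rank` columns of u and v and entries of d.
truncate_svd_sketch <- function(decomposition, rank) {
  idx <- seq_len(rank)
  list(
    u = decomposition$u[, idx, drop = FALSE],
    d = decomposition$d[idx],
    v = decomposition$v[, idx, drop = FALSE]
  )
}

set.seed(3)
X <- matrix(rnorm(30), nrow = 6, ncol = 5)
full_svd <- svd(X)
truncated <- truncate_svd_sketch(full_svd, rank = 2)
```

Using `drop = FALSE` keeps `u` and `v` as matrices even when `rank` is 1.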
Resampling procedure for the Wedin bound.
wedin_bound_resampling(X, perp_basis, right_vectors, num_samples = 1000)
X |
Matrix. The data matrix. |
perp_basis |
Matrix. Either U_perp or V_perp: the remaining left/right singular vectors of X after estimating the signal rank. |
right_vectors |
Boolean. Right multiplication or left multiplication. |
num_samples |
Integer. Number of vectors selected for resampling procedure. |
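A resampling step of this kind could be sketched as follows. The sketch assumes that columns of `perp_basis` are drawn with replacement, that `X` is multiplied on the right (by V_perp) or on the left (by the transpose of U_perp), and that the spectral norm of the product is recorded. The function name is hypothetical and the code is not taken from the package source.

```r
# Sketch only: resample columns of the perpendicular basis and record
# the spectral norm of the projected data.
wedin_resampling_sketch <- function(X, perp_basis, right_vectors,
                                    num_samples = 1000) {
  rank <- ncol(perp_basis)
  resampled_norms <- numeric(num_samples)
  for (s in seq_len(num_samples)) {
    idx <- sample.int(rank, size = rank, replace = TRUE)
    perp_resampled <- perp_basis[, idx, drop = FALSE]
    projection <- if (right_vectors) {
      X %*% perp_resampled          # perp_basis is V_perp (p x r)
    } else {
      crossprod(perp_resampled, X)  # perp_basis is U_perp (n x r)
    }
    resampled_norms[s] <- norm(projection, type = "2")
  }
  resampled_norms
}

set.seed(11)
X <- matrix(rnorm(6 * 4), nrow = 6, ncol = 4)
s <- svd(X)
signal_rank <- 2  # illustrative signal rank
V_perp <- s$v[, -seq_len(signal_rank), drop = FALSE]
U_perp <- s$u[, -seq_len(signal_rank), drop = FALSE]
right_norms <- wedin_resampling_sketch(X, V_perp, TRUE, num_samples = 100)
left_norms <- wedin_resampling_sketch(X, U_perp, FALSE, num_samples = 100)
```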