Efficient stability selection of the best coglasso network

xestars() provides a more efficient and lighter implementation than xstars() to select the combination of hyperparameters given to coglasso() yielding the most stable, yet sparse network. Stability is computed upon network estimation from multiple subsamples of the multi-omics data set, allowing repetition. Subsamples are drawn a fixed number of times (rep_num), each containing a fixed proportion of the total number of samples (stars_subsample_ratio).

Usage

xestars(
  coglasso_obj,
  stars_thresh = 0.1,
  stars_subsample_ratio = NULL,
  rep_num = 20,
  max_iter = 10,
  old_sampling = FALSE,
  light = TRUE,
  verbose = TRUE
)

Arguments

coglasso_obj: The object of S3 class coglasso returned by coglasso().
stars_thresh: The threshold set for variability of the explored networks at each iteration of the algorithm. The \(\lambda_w\) or the \(\lambda_b\) associated with the most stable network before the threshold is exceeded is selected.
stars_subsample_ratio: The proportion of samples in the multi-omics data set to be randomly subsampled to estimate the variability of the network under the given hyperparameter setting. Defaults to 80% when the number of samples is smaller than 144, otherwise it defaults to \(\frac{10}{n}\sqrt{n}\).
rep_num: The number of subsamples of the multi-omics data set used to estimate the variability of the network under the given hyperparameter setting. Defaults to 20.
max_iter: The greatest number of times the algorithm is allowed to choose a new best \(\lambda_w\). Defaults to 10.
old_sampling: Perform the same subsampling xstars() would if set to TRUE. Makes a difference with bigger data sets, where computing a correlation matrix could take significantly longer. Defaults to FALSE.
light: If set to TRUE, do not store the "merged" matrices recording the average variability of each edge, making the algorithm more memory efficient. Defaults to TRUE.
verbose: Print information regarding the progress of the selection procedure on the console. Defaults to TRUE.
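
The default rule described for stars_subsample_ratio can be written out as a short base R sketch. The function name default_subsample_ratio() is hypothetical and is not part of coglasso; it only restates the rule given above and may not match the package internals line by line.

```r
# Hypothetical helper (not a coglasso function): default subsampling
# proportion as described for the stars_subsample_ratio argument.
default_subsample_ratio <- function(n) {
  if (n < 144) {
    0.8                  # small data sets: keep 80% of the samples
  } else {
    10 * sqrt(n) / n     # larger data sets: (10 / n) * sqrt(n)
  }
}

default_subsample_ratio(100)  # 80% of the samples
default_subsample_ratio(400)  # 10 * 20 / 400
```

With this rule the subsample size for larger data sets grows like \(10\sqrt{n}\), so the proportion shrinks as the number of samples increases.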

Value

xestars() returns an object of S3 class select_coglasso containing the results of the selection procedure, built upon the object of S3 class coglasso returned by coglasso().

... are the same elements returned by coglasso().
opt_adj is a list of the adjacency matrices finally selected for each \(c\) parameter explored.
opt_variability is a numerical vector containing the variabilities associated to the adjacency matrices in opt_adj.
opt_index_lw and opt_index_lb are integer vectors containing the index of the selected \(\lambda_w\)s (or \(\lambda_b\)s) for each \(c\) parameter explored.
opt_lambda_w and opt_lambda_b are vectors containing the selected \(\lambda_w\)s (or \(\lambda_b\)s) for each \(c\) parameter explored.
sel_index_c, sel_index_lw and sel_index_lb are the indexes of the final selected parameters \(c\), \(\lambda_w\) and \(\lambda_b\) leading to the most stable sparse network.
sel_c, sel_lambda_w and sel_lambda_b are the final selected parameters \(c\), \(\lambda_w\) and \(\lambda_b\) leading to the most stable sparse network.
sel_adj is the adjacency matrix of the final selected network.
sel_density is the density of the final selected network.
sel_icov is the inverse covariance matrix of the final selected network.
call is the matched call.
method is the chosen model selection method. Here, it is "xestars".
merge_lw and merge_lb are returned only if light is set to FALSE. They are lists with as many elements as the number of \(c\) parameters explored. Every element is a "merged" adjacency matrix, the average of all the adjacency matrices estimated for that specific \(c\) and the selected \(\lambda_w\) (or \(\lambda_b\)) value across all the subsamples in the last path explored before convergence, that is, the path in which the final combination of \(\lambda_w\) and \(\lambda_b\) is selected for the given \(c\) value.
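
To make the notion of a "merged" matrix concrete, here is a minimal base R sketch that averages a list of 0/1 adjacency matrices and derives a variability score from the result. The helper names merge_adjacency() and edge_variability() are hypothetical, and the score follows the definition in Liu et al. (2010), the mean of \(2\theta(1-\theta)\) over all node pairs, where \(\theta\) is the frequency with which an edge is selected. The normalization used inside coglasso may differ, so treat this as an illustration rather than the package's implementation.

```r
# Hypothetical helpers, not part of coglasso.

# Average a list of 0/1 adjacency matrices: each entry becomes the
# fraction of subsamples in which that edge was selected.
merge_adjacency <- function(adj_list) {
  Reduce(`+`, adj_list) / length(adj_list)
}

# StARS-style variability (Liu et al., 2010): mean of 2 * theta * (1 - theta)
# over the upper triangle, i.e. over every unordered pair of nodes.
edge_variability <- function(merged) {
  theta <- merged[upper.tri(merged)]
  mean(2 * theta * (1 - theta))
}

# Toy input: two distinct 3-node networks, each "estimated" twice.
a1 <- matrix(c(0, 1, 0,
               1, 0, 0,
               0, 0, 0), nrow = 3, ncol = 3)
a2 <- matrix(c(0, 1, 1,
               1, 0, 0,
               1, 0, 0), nrow = 3, ncol = 3)
merged <- merge_adjacency(list(a1, a1, a2, a2))

merged                    # edge 1-2 always present, edge 1-3 half of the time
edge_variability(merged)  # only the edge 1-3 contributes to the score
```

An edge that is always present or always absent contributes zero, while an edge selected in half of the subsamples contributes the maximum of 0.5 before averaging.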

Details

eXtended Efficient StARS (XEStARS) is a more efficient and memory-light version of XStARS, the adaptation for collaborative graphical regression of the method published by Liu, H. et al. (2010): Stability Approach to Regularization Selection (StARS). StARS was developed for network estimation regulated by a single penalty parameter, while collaborative graphical lasso needs to explore three different hyperparameters. In particular, two of these are penalty parameters with a direct influence on network sparsity, hence on stability. For every \(c\) parameter, xestars() explores one of the two penalty parameters (\(\lambda_w\) or \(\lambda_b\)), keeping the other one fixed at its previous best estimate, using the normal, one-dimensional StARS approach, until it finds the best pair. What makes it more efficient than xstars() is that the stability check, which in the original algorithm (and in the original StARS) is performed for every \(\lambda_w\) or \(\lambda_b\) value, is implemented here as a stopping criterion. This noticeably reduces the number of iterations before convergence. It then selects the \(c\) parameter for which the best (\(\lambda_w\), \(\lambda_b\)) pair yielded the most stable, yet sparse network.
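
The idea of using the stability check as a stopping criterion can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the code of xestars(): the function select_before_threshold() and the toy variability values are hypothetical, and the penalty path is assumed to be ordered from the sparsest (most stable) network to the densest one.

```r
# Hypothetical helper, not part of coglasso: walk along a one-dimensional
# penalty path and stop as soon as the variability exceeds the threshold.
# variability_at(i) is assumed to return the variability of the network
# estimated with the i-th penalty value of the path.
select_before_threshold <- function(variability_at, n_lambda, thresh = 0.1) {
  selected <- 1L
  for (i in seq_len(n_lambda)) {
    if (variability_at(i) > thresh) break  # stop: threshold exceeded
    selected <- i                          # last index still under threshold
  }
  selected
}

# Toy variability values along a path of five penalty values.
toy_variability <- c(0.00, 0.02, 0.07, 0.15, 0.30)

# Count how many path positions are actually evaluated.
counter <- new.env()
counter$n <- 0L
variability_at <- function(i) {
  counter$n <- counter$n + 1L
  toy_variability[i]
}

selected <- select_before_threshold(variability_at, length(toy_variability))
selected   # index of the last penalty value under the threshold
counter$n  # positions evaluated before stopping
```

In this toy run the fifth position is never evaluated, which is where the saving comes from: every skipped position is a set of network estimations over all subsamples that does not have to be computed.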
The original XStARS computes a new subsampling every time the algorithm switches between optimizing \(\lambda_w\) and \(\lambda_b\), and for every \(c\). This does not allow the hyperparameters to be compared on equal ground, and can slow the selection down with bigger data sets or a larger hyperparameter space. To allow a fairer (and faster) comparison among different optimizations, the old_sampling parameter has been implemented. If set to TRUE, the subsampling is the same one xstars() would perform. Otherwise, the subsampling is performed only once, at the beginning of the algorithm, and reused for all its iterations.
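
A minimal base R sketch of the "subsample once, reuse everywhere" strategy follows. It is not taken from the coglasso source: the simulated data, the variable names, and the choice to draw each subsample without replacement (as in the original StARS) are assumptions made for illustration.

```r
set.seed(42)

# Simulated stand-in for a multi-omics data set: 200 samples, 6 variables.
x <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6)
n <- nrow(x)

# Default proportion as described for stars_subsample_ratio.
ratio <- if (n < 144) 0.8 else 10 * sqrt(n) / n
rep_num <- 20

# Draw the row indices of every subsample once, up front.
# Assumption: rows are drawn without replacement within each subsample.
subsample_rows <- replicate(rep_num, sample(n, floor(n * ratio)),
                            simplify = FALSE)

# Compute one correlation matrix per subsample, also once. The same list
# can then be reused for every (c, lambda_w, lambda_b) combination instead
# of being recomputed at each switch between penalty parameters.
subsample_cors <- lapply(subsample_rows, function(rows) {
  cor(x[rows, , drop = FALSE])
})

length(subsample_cors)       # one correlation matrix per subsample
lengths(subsample_rows)[1]   # number of samples in each subsample
```

Because every hyperparameter combination is then evaluated on the same subsamples, differences in variability reflect the hyperparameters rather than the random draw.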
To make xestars() lighter on memory, the light parameter has been implemented. If set to TRUE, the "merged" matrices traditionally returned by both StARS and XStARS are not returned.

Examples

cg <- coglasso(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3, 
               nlambda_b = 3, nc = 3, verbose = FALSE)
# \donttest{
# Takes less than five seconds
sel_cg <- xestars(cg, verbose = FALSE)
# }