xestars() provides a more efficient and lighter implementation than xstars()
to select the combination of hyperparameters given to coglasso() yielding the
most stable, yet sparse network. Stability is computed upon network estimation
from multiple subsamples of the multi-omics data set, allowing repetition.
Subsamples are collected a fixed number of times (rep_num), each with a fixed
proportion of the total number of samples (stars_subsample_ratio).
Usage
xestars(
  coglasso_obj,
  stars_thresh = 0.1,
  stars_subsample_ratio = NULL,
  rep_num = 20,
  max_iter = 10,
  old_sampling = FALSE,
  light = TRUE,
  verbose = TRUE
)
Arguments
- coglasso_obj
The object of S3 class coglasso returned by coglasso().
- stars_thresh
The threshold set for the variability of the explored networks at each iteration of the algorithm. The \(\lambda_w\) or \(\lambda_b\) associated with the most stable network before the threshold is exceeded is selected.
- stars_subsample_ratio
The proportion of samples in the multi-omics data set to be randomly subsampled to estimate the variability of the network under the given hyperparameters setting. Defaults to 80% when the number of samples is smaller than 144, otherwise it defaults to \(\frac{10}{n}\sqrt{n}\).
- rep_num
The number of subsamples of the multi-omics data set used to estimate the variability of the network under the given hyperparameters setting. Defaults to 20.
- max_iter
The greatest number of times the algorithm is allowed to choose a new best \(\lambda_w\). Defaults to 10.
- old_sampling
If set to TRUE, perform the same subsampling xstars() would. This makes a difference with bigger data sets, where computing a correlation matrix could take significantly longer. Defaults to FALSE.
- light
If set to TRUE, do not store the "merged" matrices recording the average variability of each edge, making the algorithm more memory efficient. Defaults to TRUE.
- verbose
Print information regarding the progress of the selection procedure on the console.
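The default rule stated for stars_subsample_ratio can be written out in a few lines of base R. This is only a sketch of the rule as described in the argument list: the helper name default_subsample_ratio() is hypothetical and is not a function exported by the package.

```r
# Hypothetical helper, not part of coglasso: reproduces the documented default
# for stars_subsample_ratio given the number of samples n.
default_subsample_ratio <- function(n) {
  if (n < 144) {
    0.8                 # 80% of the samples for smaller data sets
  } else {
    10 * sqrt(n) / n    # (10 / n) * sqrt(n) otherwise
  }
}

ratio_small <- default_subsample_ratio(100)  # fewer than 144 samples
ratio_large <- default_subsample_ratio(400)  # 10 * 20 / 400
```

With 400 samples the rule gives a ratio of 0.5, so each subsample would contain about 200 samples.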
Value
xestars() returns an object of S3 class select_coglasso containing the results of the selection procedure, built upon the object of S3 class coglasso returned by coglasso().
- ... are the same elements returned by coglasso().
- opt_adj is a list of the adjacency matrices finally selected for each \(c\) parameter explored.
- opt_variability is a numerical vector containing the variabilities associated with the adjacency matrices in opt_adj.
- opt_index_lw and opt_index_lb are integer vectors containing the index of the selected \(\lambda_w\)s (or \(\lambda_b\)s) for each \(c\) parameter explored.
- opt_lambda_w and opt_lambda_b are vectors containing the selected \(\lambda_w\)s (or \(\lambda_b\)s) for each \(c\) parameter explored.
- sel_index_c, sel_index_lw and sel_index_lb are the indexes of the final selected parameters \(c\), \(\lambda_w\) and \(\lambda_b\) leading to the most stable sparse network.
- sel_c, sel_lambda_w and sel_lambda_b are the final selected parameters \(c\), \(\lambda_w\) and \(\lambda_b\) leading to the most stable sparse network.
- sel_adj is the adjacency matrix of the final selected network.
- sel_density is the density of the final selected network.
- sel_icov is the inverse covariance matrix of the final selected network.
- call is the matched call.
- method is the chosen model selection method. Here, it is "xestars".
- merge_lw and merge_lb are returned only if light is set to FALSE. They are lists with as many elements as the number of \(c\) parameters explored. Every element is a "merged" adjacency matrix: the average of all the adjacency matrices estimated for that specific \(c\) and the selected \(\lambda_w\) (or \(\lambda_b\)) value across all the subsamples in the last path explored before convergence, that is, the path in which the final combination of \(\lambda_w\) and \(\lambda_b\) is selected for the given \(c\) value.
Details
eXtended Efficient StARS (XEStARS) is a more efficient and memory-light version of
XStARS, the adaptation for collaborative graphical regression of the method
published by Liu, H. et al. (2010): Stability Approach to Regularization
Selection (StARS). StARS was developed for network estimation regulated by
a single penalty parameter, while collaborative graphical lasso needs to
explore three different hyperparameters. In particular, two of these are
penalty parameters with a direct influence on network sparsity, hence on
stability. For every \(c\) parameter, xestars() explores one of the two
penalty parameters (\(\lambda_w\) or \(\lambda_b\)), keeping the other
one fixed at its previous best estimate, using the normal, one-dimensional
StARS approach, until finding the best pair. What makes it more efficient
than xstars() is that the stability check, which the original algorithm
(and the original StARS) performs for every \(\lambda_w\) or \(\lambda_b\)
value, is implemented here as a stopping criterion. This noticeably reduces
the number of iterations before convergence. It then selects the \(c\)
parameter for which the best (\(\lambda_w\), \(\lambda_b\)) pair yielded the
most stable, yet sparse network.
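As a rough illustration of the one-dimensional StARS step described above, the base R sketch below computes a variability score from the adjacency matrices estimated on several subsamples, then picks the last penalty value before the threshold is exceeded. It is not the code used by xestars(). The names stars_variability() and select_index() are hypothetical, the instability formula follows the definition in Liu et al. (2010), and the package may scale the score or order the penalty path differently. The sketch assumes the variabilities are ordered from the sparsest to the densest network.

```r
# Hypothetical helper: variability of a network across subsamples.
# `adj_list` is a list of symmetric 0/1 adjacency matrices, one per subsample,
# all estimated with the same hyperparameters.
stars_variability <- function(adj_list) {
  # "Merged" matrix: fraction of subsamples in which each edge appears
  merged <- Reduce(`+`, adj_list) / length(adj_list)
  # Edge instability as in Liu et al. (2010): 2 * theta * (1 - theta)
  xi <- 2 * merged * (1 - merged)
  # Average over all possible edges (upper triangle only)
  mean(xi[upper.tri(xi)])
}

# Hypothetical helper: index of the last penalty before variability
# first exceeds the threshold (assumes a sparse-to-dense ordering).
select_index <- function(variability, thresh = 0.1) {
  over <- which(variability > thresh)
  if (length(over) == 0) {
    return(length(variability))
  }
  max(over[1] - 1, 1)
}

# Toy data: three subsamples of a 3-node network
a1 <- matrix(c(0, 1, 0,  1, 0, 0,  0, 0, 0), 3, 3)
a2 <- matrix(c(0, 1, 1,  1, 0, 0,  1, 0, 0), 3, 3)
a3 <- matrix(c(0, 1, 0,  1, 0, 0,  0, 0, 0), 3, 3)
v <- stars_variability(list(a1, a2, a3))

# Toy variability path along five penalty values
idx <- select_index(c(0, 0.02, 0.08, 0.15, 0.3), thresh = 0.1)
```

In the toy data only the edge between nodes 1 and 3 is unstable (present in one subsample out of three), so it alone contributes to the score. Using the check as a stopping criterion, as xestars() is described to do, would mean halting the path at the first value that exceeds the threshold rather than scoring every penalty value.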
The original XStARS draws a new subsampling every time the algorithm
switches between optimizing \(\lambda_w\) and \(\lambda_b\), and for
every \(c\). This does not allow the hyperparameters to be compared on equal
ground, and can slow the selection down with bigger data sets or a larger
hyperparameter space. To allow a fairer (and faster) comparison among
different optimizations, the old_sampling parameter has been implemented.
If set to TRUE, the subsampling is the same one xstars() would perform.
Otherwise, the subsampling is performed once at the beginning of the
algorithm and reused for all its iterations.
To allow xestars() to be more memory light, the light parameter has been
implemented. If set to TRUE, the "merged" matrices traditionally returned
by both StARS and XStARS are not returned.