bs() wraps the two main functions of the package in a single one:
coglasso(), to build multiple multi-omics networks, and select_coglasso()
to select the best one according to the chosen criterion.
Usage
bs(
data,
p = NULL,
pX = lifecycle::deprecated(),
lambda_w = NULL,
lambda_b = NULL,
c = NULL,
nlambda_w = NULL,
nlambda_b = NULL,
nc = NULL,
lambda_w_max = NULL,
lambda_b_max = NULL,
c_max = NULL,
lambda_w_min_ratio = NULL,
lambda_b_min_ratio = NULL,
c_min = NULL,
icov_guess = NULL,
cov_output = FALSE,
lock_lambdas = FALSE,
method = "xestars",
stars_thresh = 0.1,
stars_subsample_ratio = NULL,
rep_num = 20,
max_iter = 10,
old_sampling = FALSE,
ebic_gamma = 0.5,
verbose = TRUE
)
Arguments
- data
The input multi-omics data set. Rows should be samples, columns should be variables. Variables should be grouped by their assay (e.g. transcripts first, then metabolites). data is a required parameter.
- p
A vector with the number of variables for each omic layer of the data set (e.g. the number of transcripts, metabolites etc.), in the same order the layers have in the data set. If given a single number, coglasso() assumes that the total number of data sets is two, and that the number given is the dimension of the first one.
- pX
- lambda_w
A vector of values for the parameter \(\lambda_w\), the penalization parameter for the "within" interactions. Overrides nlambda_w.
- lambda_b
A vector of values for the parameter \(\lambda_b\), the penalization parameter for the "between" interactions. Overrides nlambda_b.
- c
A vector of values for the parameter \(c\), the weight given to collaboration. Overrides nc.
- nlambda_w
The number of requested \(\lambda_w\) parameters to explore. A sequence of size nlambda_w of \(\lambda_w\) parameters will be generated. Defaults to 8. Ignored when lambda_w is set by the user.
- nlambda_b
The number of requested \(\lambda_b\) parameters to explore. A sequence of size nlambda_b of \(\lambda_b\) parameters will be generated. Defaults to 8. Ignored when lambda_b is set by the user.
- nc
The number of requested \(c\) parameters to explore. A sequence of size nc of \(c\) parameters will be generated. Defaults to 5. Ignored when c is set by the user.
- lambda_w_max
The greatest generated \(\lambda_w\). By default it is computed with a data-driven approach. Ignored when lambda_w is set by the user.
- lambda_b_max
The greatest generated \(\lambda_b\). By default it is computed with a data-driven approach. Ignored when lambda_b is set by the user.
- c_max
The greatest \(c\) explored. Defaults to 100. Ignored when c is set by the user.
- lambda_w_min_ratio
The ratio of the smallest generated \(\lambda_w\) over the greatest generated \(\lambda_w\). Defaults to 0.1. Ignored when lambda_w is set by the user.
- lambda_b_min_ratio
The ratio of the smallest generated \(\lambda_b\) over the greatest generated \(\lambda_b\). Defaults to 0.1. Ignored when lambda_b is set by the user.
- c_min
The smallest \(c\) explored. Defaults to \(\frac{1}{c_{max}}\), so to 0.01 if c_max is not set by the user. Ignored when c is set by the user.
- icov_guess
Use a predetermined inverse covariance matrix as an initial guess for the network estimation.
- cov_output
Add the estimated variance-covariance matrix to the output.
- lock_lambdas
Set \(\lambda_w = \lambda_b\). Force a single lambda parameter for both "within" and "between" interactions.
- method
The model selection method to select the best combination of hyperparameters. The available options are "xstars", "xestars" and "ebic". Defaults to "xestars".
- stars_thresh
The threshold set for variability of the explored networks at each iteration of the algorithm. The \(\lambda_w\) or the \(\lambda_b\) associated with the most stable network before the threshold is exceeded is selected. Defaults to 0.1.
- stars_subsample_ratio
The proportion of samples in the multi-omics data set to be randomly subsampled to estimate the variability of the network under the given hyperparameters setting. Defaults to 80% when the number of samples is smaller than 144, otherwise it defaults to \(\frac{10}{n}\sqrt{n}\).
- rep_num
The amount of subsamples of the multi-omics data set used to estimate the variability of the network under the given hyperparameters setting. Defaults to 20.
- max_iter
The greatest number of times the algorithm is allowed to choose a new best \(\lambda_w\). Defaults to 10.
- old_sampling
Perform the same subsampling xstars() would if set to TRUE. Makes a difference with bigger data sets, where computing a correlation matrix could take significantly longer. Defaults to FALSE.
- ebic_gamma
The \(\gamma\) tuning parameter for eBIC selection, to set between 0 and 1. When set to 0 one has the standard BIC. Defaults to 0.5.
- verbose
Print information regarding the network building and the network selection processes.
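The documented defaults for stars_subsample_ratio and c_min can be reproduced with a few lines of base R. The sketch below only mirrors the rules stated in the argument descriptions above; the helper names default_subsample_ratio() and default_c_min() are hypothetical and are not part of the package.

```r
# Hypothetical helpers mirroring the documented defaults; not part of coglasso.

# 80% for fewer than 144 samples, otherwise (10 / n) * sqrt(n).
default_subsample_ratio <- function(n) {
  if (n < 144) 0.8 else (10 / n) * sqrt(n)
}

# c_min defaults to 1 / c_max, and c_max itself defaults to 100.
default_c_min <- function(c_max = 100) {
  1 / c_max
}

default_subsample_ratio(100)  # fewer than 144 samples
default_subsample_ratio(400)  # 144 samples or more
default_c_min()               # uses the default c_max
```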
Value
bs() returns an object of S3 class select_coglasso containing
several elements. The most
important is probably sel_adj, the adjacency matrix of the
selected network. Some output elements depend on the chosen model selection
method.
These elements are always returned, and they are the result of network
estimation with coglasso():
- loglik is a numerical vector containing the \(log\) likelihoods of all the estimated networks.
- density is a numerical vector containing a measure of the density of all the estimated networks.
- df is an integer vector containing the degrees of freedom of all the estimated networks.
- convergence is a binary vector containing whether a network was successfully estimated for the given combination of hyperparameters or not.
- path is a list containing the adjacency matrices of all the estimated networks.
- icov is a list containing the inverse covariance matrices of all the estimated networks.
- nexploded is the number of combinations of hyperparameters for which coglasso() failed to converge.
- data is the input multi-omics data set.
- hpars is the ordered table of all the combinations of hyperparameters given as input to bs(), with \(\alpha(\lambda_w+\lambda_b)\) being the key to sort rows.
- lambda_w, lambda_b, and c are numerical vectors with, respectively, all the \(\lambda_w\), \(\lambda_b\), and \(c\) values bs() used.
- p is the vector with the number of variables for each omic layer of the data set.
- D is the number of omics layers in the data set.
- cov (optional, returned when cov_output is TRUE) is a list containing the variance-covariance matrices of all the estimated networks.
These elements are returned by all selection methods available:
- sel_index_c, sel_index_lw and sel_index_lb are the indexes of the final selected parameters \(c\), \(\lambda_w\) and \(\lambda_b\) leading to the most stable sparse network.
- sel_c, sel_lambda_w and sel_lambda_b are the final selected parameters \(c\), \(\lambda_w\) and \(\lambda_b\) leading to the most stable sparse network.
- sel_adj is the adjacency matrix of the final selected network.
- sel_density is the density of the final selected network.
- sel_icov is the inverse covariance matrix of the final selected network.
- sel_cov (optional, given only when coglasso() was called with cov_output = TRUE) is the covariance matrix associated with the final selected network.
- call is the matched call.
- method is the chosen model selection method.
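As an illustration of how the selection elements listed above might be read from the returned object, the following sketch reuses the call from the Examples section and accesses the elements with `$`. It assumes the coglasso package is installed and that multi_omics_sd_micro is available once the package is loaded.

```r
library(coglasso)

# Same call as in the Examples section of this page.
sel_mo_net <- bs(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3,
                 nlambda_b = 3, nc = 3, verbose = FALSE)

# Selected hyperparameters.
sel_mo_net$sel_lambda_w
sel_mo_net$sel_lambda_b
sel_mo_net$sel_c

# Adjacency matrix and density of the selected network.
sel_mo_net$sel_adj
sel_mo_net$sel_density

# Selection method recorded in the object.
sel_mo_net$method
```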
These are the additional elements returned when choosing "xestars" or "xstars":
- merge is the "merged" adjacency matrix, the average of all the adjacency matrices estimated across all the different subsamples for the selected combination of \(\lambda_w\), \(\lambda_b\), and \(c\) values in the last path explored before convergence. Each entry is a measure of how recurrent the corresponding edge is across the subsamples.
- variability_lw, variability_lb and variability_c are numeric vectors of as many items as the number of \(\lambda_w\), \(\lambda_b\), and \(c\) values explored. Each item is the variability of the network estimated for the corresponding hyperparameter value, keeping the other two hyperparameters fixed to their selected value.
- sel_variability is the variability of the final selected network.
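To make the meaning of merge concrete, the toy sketch below averages a handful of made-up adjacency matrices, as the description above states. The variability line follows the usual StARS definition (the mean of \(2\theta(1-\theta)\) over node pairs); that this is exactly what the package computes is an assumption. All objects here are illustrative and none of them come from coglasso.

```r
# Three made-up adjacency matrices for 3 nodes, one per subsample.
adj_list <- list(
  matrix(c(0, 1, 0,
           1, 0, 1,
           0, 1, 0), nrow = 3, byrow = TRUE),
  matrix(c(0, 1, 0,
           1, 0, 0,
           0, 0, 0), nrow = 3, byrow = TRUE),
  matrix(c(0, 1, 1,
           1, 0, 1,
           1, 1, 0), nrow = 3, byrow = TRUE)
)

# "Merged" matrix: entry-wise average, i.e. the fraction of subsamples
# in which each edge appears.
merge_mat <- Reduce(`+`, adj_list) / length(adj_list)

# StARS-style variability (assumed definition): average of
# 2 * theta * (1 - theta) over the distinct node pairs.
theta <- merge_mat[upper.tri(merge_mat)]
variability <- mean(2 * theta * (1 - theta))

merge_mat
variability
```

An edge present in every subsample (or in none) contributes zero variability, while an edge present in about half of them contributes the most.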
These are the additional elements returned when choosing "ebic":
- ebic_scores is a numerical vector containing the eBIC scores for all the hyperparameter combinations.
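For intuition on how ebic_gamma affects the score, here is a minimal sketch of the eBIC in the form commonly used for Gaussian graphical models, \(-2\ell + |E|\log n + 4\gamma|E|\log p\). The helper ebic_score() and its inputs are hypothetical, and whether the package scales the log-likelihood or counts edges in exactly this way is an assumption. With gamma = 0 the expression reduces to the standard BIC, in line with the description of ebic_gamma.

```r
# Hypothetical helper, not part of coglasso.
# loglik:  log-likelihood of the fitted network
# n_edges: number of edges in the network
# n:       number of samples
# p:       number of variables
ebic_score <- function(loglik, n_edges, n, p, gamma = 0.5) {
  stopifnot(gamma >= 0, gamma <= 1)
  -2 * loglik + n_edges * log(n) + 4 * gamma * n_edges * log(p)
}

# Made-up numbers, for illustration only.
bic  <- ebic_score(loglik = -120, n_edges = 5, n = 30, p = 6, gamma = 0)
ebic <- ebic_score(loglik = -120, n_edges = 5, n = 30, p = 6, gamma = 0.5)

bic
ebic
```

Larger values of gamma penalize each additional edge more strongly, so they tend to favor sparser networks.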
Details
When using bs(), first, coglasso() estimates multiple multi-omics networks
with the algorithm collaborative graphical lasso, one for each combination
of input values for the hyperparameters \(\lambda_w\), \(\lambda_b\) and
\(c\). Then, select_coglasso() selects the best combination of
hyperparameters given to coglasso() according to the selected model
selection method. The three available options that can be set for the argument
method are "xstars", "xestars" and "ebic". For more information on these
selection methods, visit the help page of select_coglasso().
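The two steps described above can also be sketched as separate calls. This assumes that select_coglasso() takes the object returned by coglasso() as its first argument and accepts the same method and verbose arguments as bs(); check the help pages of both functions before relying on it.

```r
library(coglasso)

# Step 1: estimate one network per combination of hyperparameters.
nets <- coglasso(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3,
                 nlambda_b = 3, nc = 3, verbose = FALSE)

# Step 2: select the best combination (assumed signature: the coglasso()
# result first, then the model selection method).
sel <- select_coglasso(nets, method = "xestars", verbose = FALSE)

# Adjacency matrix of the selected network.
sel$sel_adj
```

Running the steps separately can be convenient when the same set of estimated networks is to be compared under more than one selection method.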
Examples
# Suggested usage: give the input data set, set the values for `p` and the
# number of hyperparameters to explore (to choose how extensively to explore
# the possible hyperparameters). Then, let the default behavior do the rest:
sel_mo_net <- bs(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3,
                 nlambda_b = 3, nc = 3, verbose = FALSE)
