Estimate networks from a multi-omics data set

coglasso() estimates multiple multi-omics networks using the collaborative graphical lasso algorithm, one for each combination of input values of the hyperparameters \(\lambda_w\), \(\lambda_b\) and \(c\).

Usage

coglasso(
  data,
  p = NULL,
  pX = lifecycle::deprecated(),
  lambda_w = NULL,
  lambda_b = NULL,
  c = NULL,
  nlambda_w = NULL,
  nlambda_b = NULL,
  nc = NULL,
  lambda_w_max = NULL,
  lambda_b_max = NULL,
  c_max = NULL,
  lambda_w_min_ratio = NULL,
  lambda_b_min_ratio = NULL,
  c_min_ratio = NULL,
  icov_guess = NULL,
  cov_output = FALSE,
  lock_lambdas = FALSE,
  verbose = TRUE
)

Arguments

data: The input multi-omics data set. Rows should be samples, columns should be variables. Variables should be grouped by their assay (e.g. transcripts first, then metabolites). data is a required parameter.
p: A vector with the number of variables for each omic layer of the data set (e.g. the number of transcripts, metabolites etc.), in the same order the layers have in the data set. If given a single number, coglasso() assumes that the data set has two layers in total, and that the number given is the dimension of the first one.
pX: pX is no longer supported. Please use p.
lambda_w: A vector of values for the parameter \(\lambda_w\), the penalization parameter for the "within" interactions. Overrides nlambda_w.
lambda_b: A vector of values for the parameter \(\lambda_b\), the penalization parameter for the "between" interactions. Overrides nlambda_b.
c: A vector of values for the parameter \(c\), the weight given to collaboration. Overrides nc.
nlambda_w: The number of requested \(\lambda_w\) parameters to explore. A sequence of size nlambda_w of \(\lambda_w\) parameters will be generated. Defaults to 8. Ignored when lambda_w is set by the user.
nlambda_b: The number of requested \(\lambda_b\) parameters to explore. A sequence of size nlambda_b of \(\lambda_b\) parameters will be generated. Defaults to 8. Ignored when lambda_b is set by the user.
nc: The number of requested \(c\) parameters to explore. A sequence of size nc of \(c\) parameters will be generated. Defaults to 8. Ignored when c is set by the user.
lambda_w_max: The greatest generated \(\lambda_w\). By default it is computed with a data-driven approach. Ignored when lambda_w is set by the user.
lambda_b_max: The greatest generated \(\lambda_b\). By default it is computed with a data-driven approach. Ignored when lambda_b is set by the user.
c_max: The greatest generated \(c\). Defaults to 10. Ignored when c is set by the user.
lambda_w_min_ratio: The ratio of the smallest generated \(\lambda_w\) over the greatest generated \(\lambda_w\). Defaults to 0.1. Ignored when lambda_w is set by the user.
lambda_b_min_ratio: The ratio of the smallest generated \(\lambda_b\) over the greatest generated \(\lambda_b\). Defaults to 0.1. Ignored when lambda_b is set by the user.
c_min_ratio: The ratio of the smallest generated \(c\) over the greatest generated \(c\). Defaults to 0.1. Ignored when c is set by the user.
icov_guess: Use a predetermined inverse covariance matrix as an initial guess for the network estimation.
cov_output: Add the estimated variance-covariance matrix to the output.
lock_lambdas: Set \(\lambda_w = \lambda_b\). Force a single lambda parameter for both "within" and "between" interactions.
verbose: Print information about the current coglasso run to the console.
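
The following base R sketch illustrates how the hyperparameter arguments relate to the number of estimated networks. It only mirrors the description above (one network per combination of \(\lambda_w\), \(\lambda_b\) and \(c\)) and is not the internal implementation of coglasso(). The numeric values and the names grid, n_networks, c_values and c_min are hypothetical and chosen only for illustration.

```r
# Hypothetical hyperparameter values, chosen only for illustration
lambda_w <- c(0.5, 0.25, 0.1)
lambda_b <- c(0.4, 0.2)
c_values <- c(1, 5)

# One row per combination of the three hyperparameters
grid <- expand.grid(lambda_w = lambda_w, lambda_b = lambda_b, c = c_values)

# Number of combinations, i.e. number of networks that would be requested
n_networks <- nrow(grid)
print(n_networks)

# A *_min_ratio argument is the ratio of the smallest generated value over the
# greatest one, so with the documented defaults for c the smallest generated c
# would be c_max * c_min_ratio
c_max <- 10
c_min_ratio <- 0.1
c_min <- c_max * c_min_ratio
print(c_min)
```

The sketch does not cover how the intermediate values of a generated sequence are spaced, nor the effect of lock_lambdas = TRUE on the number of combinations.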

Value

coglasso() returns an object of S3 class coglasso, which has the following elements:

loglik is a numerical vector containing the log-likelihoods of all the estimated networks.
density is a numerical vector containing a measure of the density of all the estimated networks.
df is an integer vector containing the degrees of freedom of all the estimated networks.
convergence is a binary vector indicating, for each combination of hyperparameters, whether a network was successfully estimated.
path is a list containing the adjacency matrices of all the estimated networks.
icov is a list containing the inverse covariance matrices of all the estimated networks.
nexploded is the number of combinations of hyperparameters for which coglasso() failed to converge.
data is the input multi-omics data set.
hpars is the ordered table of all the combinations of hyperparameters given as input to coglasso(), with \(\alpha(\lambda_w+\lambda_b)\) being the key to sort rows.
lambda_w is a numerical vector with all the \(\lambda_w\) values coglasso() used.
lambda_b is a numerical vector with all the \(\lambda_b\) values coglasso() used.
c is a numerical vector with all the \(c\) values coglasso() used.
p is the vector with the number of variables for each omic layer of the data set.
D is the number of omics layers in the data set.
icov_guess is optional, returned only when icov_guess is given. It is the predetermined inverse covariance matrix given by the user as an initial guess for the network estimation.
cov is optional, returned only when cov_output is TRUE. It is a list containing the variance-covariance matrices of all the estimated networks.
call is the matched call.
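
As a minimal sketch of how the returned object might be inspected, the code below fits a small grid on the multi_omics_sd_micro data set used in the Examples and looks at some of the elements listed above. It assumes the coglasso package is installed, and it only accesses elements by the names documented in this section.

```r
library(coglasso)

# Fit a small grid of networks on the example data set shipped with the package
cg <- coglasso(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3,
               nlambda_b = 3, nc = 3, verbose = FALSE)

# Names of the elements stored in the returned object
print(names(cg))

# Ordered table of the explored hyperparameter combinations
print(head(cg$hpars))

# Log-likelihoods of the estimated networks
print(cg$loglik)

# Tabulate the convergence flags, and show the number of combinations
# for which coglasso() failed to converge
print(table(cg$convergence))
print(cg$nexploded)
```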

Examples

# Typical usage: set the number of hyperparameters to explore
cg <- coglasso(multi_omics_sd_micro, p = c(4, 2), nlambda_w = 3, 
               nlambda_b = 3, nc = 3, verbose = FALSE)
# Model selection using eXtended Efficient StARS, takes less than five seconds
sel_cg_xestars <- select_coglasso(cg, method = "xestars", verbose = FALSE)