Use copulas to transfer all omics layers to the normal realm
Usage
copulize(
layers,
p = NULL,
omics = NULL,
marginals = NULL,
noninv_method = NULL,
copula = NULL,
verbose = FALSE
)
Arguments
- layers
The omics layers to analyze. To use carmon to its full potential, they should always contain source-matched, non-normalized data. They can be provided in one of three formats:
  - A named R list of omics data sets (recommended). If possible, the names of the list should correspond to the respective omics types. If that is not possible, for example because two layers come from the same technology, provide the omics types with the parameter omics. To see a list of available omics types, use the function which_omics(). Each data set should be source-matched (the same number of matched samples or individuals in every data set). The placement of the samples (or individuals) should also be consistent: either along the rows for all the data sets or along the columns for all the data sets. All the samples (or individuals) should also be named consistently across the data sets.
  - An object of S4 class MultiAssayExperiment. In that case, we recommend using the omics parameter to specify which omics layers the object contains, in the same order as presented by MultiAssayExperiment::experiments(layers). This allows carmon to be used to its full potential. To see a list of terms and omics technologies for which carmon is specifically tailored, use the function which_omics(). Please remember that carmon() expects non-normalized data: for RNA-seq data, for example, it expects counts.
  - A single unified data set. In this case the argument p must be specified. We also recommend using the omics parameter to specify which omics layers the data set contains.
- p
Optional; to be specified only when layers is a single data set. A vector with the number of variables in each omics layer of the data set (e.g. the number of transcripts, metabolites, etc.), in the same order as the layers appear in the data set. If given a single number, carmon assumes that there are two layers in total and that the number given is the dimension of the first one.
- omics
Highly recommended. A vector with as many elements as there are layers, naming the omics type each layer contains, in the same order as provided in the input layers (e.g. omics = c('RNA-seq', 'proteomics', 'metabolomics')). To see a list of terms and omics technologies for which carmon is specifically tailored, use the function which_omics().
- marginals
Optional; to be specified when the user prefers marginal distributions other than the defaults carmon has tailored to each omics layer. A vector with as many elements as there are layers, specifying which marginal distribution should be used for each omics layer, in the same order as provided in the input layers. To see a list of available marginal distributions, use the function which_marginals(). To customize only some layers, place a 0 in the positions of the layers for which the default distribution is desired, and specify the desired marginal distribution in the other positions.
- noninv_method
A placeholder for future functionality of carmon; do not use.
- copula
A placeholder for future functionality of carmon; do not use.
- verbose
The level of verbosity of the copulization process. 0 suppresses the information output, while 1 and 2 give progressively more information about the computations happening inside copulize().
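The named-list and single-data-set input formats described for layers, p and omics can be sketched with base R alone. The toy data, the variable names, and the helper same_samples() are illustrative assumptions and not part of carmon; the helper applies a deliberately conservative check (identical sample names in identical order). The copulize() calls are left as comments because this sketch does not depend on carmon being installed.

```r
set.seed(1)
samples <- paste0("sample", 1:6)

# Toy raw (non-normalized) data, with samples along the rows in both layers.
rnaseq <- matrix(rpois(6 * 4, lambda = 20), nrow = 6,
                 dimnames = list(samples, paste0("gene", 1:4)))
metabolomics <- matrix(rlnorm(6 * 3, meanlog = 2), nrow = 6,
                       dimnames = list(samples, paste0("met", 1:3)))

# Named list: the names state the omics type of each layer.
layers <- list(`RNA-seq` = rnaseq, metabolomics = metabolomics)

# Hypothetical helper (not a carmon function): TRUE if every layer has the
# same row names, in the same order, as the first layer.
same_samples <- function(layers) {
  ref <- rownames(layers[[1]])
  all(vapply(layers, function(x) identical(rownames(x), ref), logical(1)))
}
stopifnot(same_samples(layers))

# Single unified data set: layers bound column-wise, with p giving the
# number of variables per layer, in the order the layers appear.
unified <- cbind(rnaseq, metabolomics)
p <- c(ncol(rnaseq), ncol(metabolomics))
omics <- c("RNA-seq", "metabolomics")
stopifnot(sum(p) == ncol(unified), length(omics) == length(p))

# The corresponding calls would then be (not run here):
# copulize(layers)
# copulize(unified, p = p, omics = omics)
```

Because this unified data set holds exactly two layers, the description of p implies that passing the single number 4 (the dimension of the first layer) would be read the same way as the full vector.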
Value
copulize() returns an object of S3 class carmon_cop with the following elements:
- layers is an R list, each element being the data set of the corresponding layer, already copulized and transferred to the normal realm.
- omics is a vector containing the omics type assigned to each omics layer.
- marginals is a vector containing the marginal distributions used to transfer each omics layer to the normal realm.
- copulize_call is the matched call.
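The elements listed above can be inspected with ordinary list access. The following is a small sketch that runs only when carmon is installed; it uses the multi_omics_micro data set from the example on this page and prints each element without assuming anything about its contents.

```r
# Guarded so the snippet is a no-op when carmon is not available.
if (requireNamespace("carmon", quietly = TRUE)) {
  library(carmon)
  data(multi_omics_micro)
  copulized <- copulize(multi_omics_micro, verbose = FALSE)

  print(class(copulized))           # S3 class of the returned object
  print(names(copulized$layers))    # one copulized data set per layer
  print(copulized$omics)            # omics type assigned to each layer
  print(copulized$marginals)        # marginal distribution used per layer
  print(copulized$copulize_call)    # the matched call
}
```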
Examples
# To apply the copula-based transition to the normal realm, it is sufficient
# to provide the input data as a named R list, with each element being the
# data set of an omics layer.
data(multi_omics_micro)
copulized <- copulize(multi_omics_micro, verbose = FALSE)
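For intuition about what transferring data to the normal realm means, here is a generic Gaussian-copula-style transform written in base R. It is an illustrative sketch only: the negative binomial marginal, its parameters, the mid-distribution adjustment for discrete data, and the helper to_normal() are all assumptions chosen for this example, and this page does not describe how carmon performs the step internally.

```r
set.seed(42)

# Toy count data, assumed to follow a negative binomial distribution.
counts <- rnbinom(2000, size = 5, mu = 50)

# Hypothetical helper (not a carmon function): map counts to normal scores.
# Each count is pushed through the assumed marginal CDF, then through the
# standard normal quantile function.
to_normal <- function(x, size, mu) {
  upper <- pnbinom(x, size = size, mu = mu)      # F(x)
  lower <- pnbinom(x - 1, size = size, mu = mu)  # F(x - 1)
  u <- (lower + upper) / 2                       # midpoint of the CDF jump at x
  qnorm(u)
}

z <- to_normal(counts, size = 5, mu = 50)

# If the assumed marginal fits, z should look roughly standard normal.
round(c(mean = mean(z), sd = sd(z)), 2)
```

Averaging F(x - 1) and F(x) is one common way to handle the jumps of a discrete CDF; other choices exist, and which one carmon uses is not stated here.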
