Perform the inference of the maximum likelihood clonal tree from longitudinal data.

LACE(
  D,
  lik_w = NULL,
  alpha = NULL,
  beta = NULL,
  initialization = NULL,
  random_tree = FALSE,
  keep_equivalent = TRUE,
  check_indistinguishable = TRUE,
  num_rs = 50,
  num_iter = 10000,
  n_try_bs = 500,
  learning_rate = 1,
  marginalize = FALSE,
  error_move = FALSE,
  num_processes = Inf,
  seed = NULL,
  verbose = TRUE,
  log_file = "",
  show = TRUE
)

Arguments

D

Mutation data from multiple experiments for a list of driver genes. It can be either a list with a data matrix per time point or a SummarizedExperiment object. In this latter, the object must contain two fields: assays and colData. Assays stores one unique data matrix pooling all single cells observed at each time point and colData stores a vector of labels reporting the time point when each single cell was sequenced. Ordering of cells in assays field and colData field must be the same.

lik_w

Weight for each data point. If not provided, weights to correct for sample sizes are used.

alpha

False positive error rate provided as list of elements; if a vector of alpha (and beta) is provided, the inference is performed for multiple values and the solution at maximum-likelihood is returned.

beta

False negative error rate provided as list of elements; if a vector of beta (and alpha) is provided, the inference is performed for multiple values and the solution at maximum-likelihood is returned.

initialization

Binary matrix representing a perfect philogeny clonal tree; clones are rows and mutations are columns. This parameter overrides "random_tree".

random_tree

Boolean. Shall I start MCMC search from a random tree? If FALSE (default) and initialization is NULL, search is started from a TRaIT tree (BMC Bioinformatics . 2019 Apr 25;20(1):210. doi: 10.1186/s12859-019-2795-4).

keep_equivalent

Boolean. Shall I return results (B and C) at equivalent likelihood with the best returned solution?

check_indistinguishable

Boolean. Shall I remove any indistinguishable event from input data prior inference?

num_rs

Number of restarts during mcmc inference.

num_iter

Maximum number of mcmc steps to be performed during the inference.

n_try_bs

Number of steps without change in likelihood of best solution after which to stop the mcmc.

learning_rate

Parameter to tune the probability of accepting solutions at lower values during mcmc. Value of learning_rate = 1 (default), set a probability proportional to the difference in likelihood; values of learning_rate greater than 1 inclease the chance of accepting solutions at lower likelihood during mcmc while values lower than 1 decrease such probability.

marginalize

Boolean. Shall I marginalize C when computing likelihood?

error_move

Boolean. Shall I include estimation of error rates in the MCMC moves?

num_processes

Number of processes to be used during parallel execution. To execute in single process mode, this parameter needs to be set to either NA or NULL.

seed

Seed for reproducibility.

verbose

Boolean. Shall I print to screen information messages during the execution?

log_file

log file where to print outputs when using parallel. If parallel execution is disabled, this parameter is ignored.

show

Boolean. Show the interactive interface to explore the output.

Value

A list of 9 elements: B, C, clones_prevalence, relative_likelihoods, joint_likelihood, clones_summary and error_rates. Here, B returns the maximum likelihood longitudinal clonal tree, C the attachment of cells to clones, corrected_genotypes the corrected genotypes and clones_prevalence clones' prevalence; relative_likelihoods and joint_likelihood are respectively the likelihood of the solutions at each individual time points and the joint likelihood; clones_summary provide a summary of association of mutations to clones. In equivalent_solutions, solutions (B and C) with likelihood equivalent to the best solution are returned. Finally error_rates provides the best values of alpha and beta among the considered ones.

Examples

data(longitudinal_sc_variants)
inference = LACE(D = longitudinal_sc_variants,
                 lik_w = c(0.2308772,0.2554386,0.2701754,0.2435088),
                 alpha = list(c(0.10,0.05,0.05,0.05)),
                 beta = list(c(0.10,0.05,0.05,0.05)),
                 keep_equivalent = TRUE,
                 num_rs = 5,
                 num_iter = 10,
                 n_try_bs = 5,
                 num_processes = NA,
                 seed = 12345,
                 verbose = FALSE,
                 show = FALSE)