This document lists the input parameters expected / accepted in the CALANGO definition files (or, alternatively, in the defs list).
Type: character string
Description: path to the directory where annotation files are located
Required: YES
Default: none
Type: character string
Description: path to the output directory where results should be saved
Required: YES
Default: none
Type: character string
Description: path to a file containing the genome metadata. For each genome, it should contain at least: (1) the path to the annotation data; (2) phenotype data (numeric); (3) normalization data (numeric). It must be a tab-separated value file with no column headers.
Required: YES
Default: none
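As a minimal sketch of the file layout described above, the following R snippet writes and reads back a small headerless, tab-separated metadata file. The file contents, the column order and the variable names are assumptions made for illustration only; CALANGO itself locates the relevant columns through the column-index parameters described below.

```r
# Hypothetical metadata file: tab-separated, no header line.
# Assumed column order (illustration only):
#   1 = path to annotation data, 2 = phenotype, 3 = normalization value
metadata_path <- tempfile(fileext = ".tsv")
writeLines(c("annot/genome_A.tsv\t12.5\t4200",
             "annot/genome_B.tsv\t3.1\t3900"),
           metadata_path)

# header = FALSE because the file must not contain column headers
genomes <- read.table(metadata_path, header = FALSE, sep = "\t",
                      stringsAsFactors = FALSE)

# Without headers, columns are addressed by index
phenotype     <- genomes[[2]]
normalization <- genomes[[3]]

# Phenotype and normalization data are expected to be numeric
stopifnot(is.numeric(phenotype), is.numeric(normalization))
```

Because the file has no header, a column inserted or removed upstream silently shifts every index, so it is worth checking the indices whenever the metadata file changes.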
Type: integer/numeric
Description: index of the column in the file specified in dataset.info that contains the phenotype data, which will be used to sort the genomes and find annotation terms associated with that phenotype.
Required: YES
Default: none
Type: integer/numeric
Description: index of the column in the file specified in dataset.info that contains the short names for species/lineages to be used when plotting data.
Required: YES
Default: none
Type: integer/numeric
Description: index of the column in the file specified in dataset.info that contains the group to be used for coloring the heatmaps.
Required: YES
Default: none
Type: character string
Description: dictionary data type to use. Accepts “GO” or “other”.
Required: YES
Default: none
Type: character string
Description: path to the dictionary file (a two-column tab-separated value file containing annotation IDs and their descriptions). Not needed if ontology = "GO".
Required: NO
Default: none
Type: character string
Description: the name of the column in the annotation file that should be used.
Required: YES
Default: none
Type: integer/numeric
Description: index of the column in the file specified in dataset.info that contains the normalization data.
Required: NO
Default: none
Type: character string
Description: tree file type. Accepts “nexus” or “newick”. Case-sensitive.
Required: YES
Default: none
Type: character string
Description: type of analysis to perform. Currently accepts only “correlation”.
Required: YES
Default: none
Type: character string
Description: type of multiple hypothesis testing correction to apply. Accepts all methods listed in stats::p.adjust.methods.
Required: NO
Default: “BH”
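To illustrate what the correction method controls, here is a small sketch using base R's stats::p.adjust with the default “BH” method. The p-values are made up for the example, and the snippet does not claim to reproduce how CALANGO calls the function internally.

```r
# Made-up p-values, for illustration only
p_values <- c(0.001, 0.01, 0.02, 0.04, 0.5)

# Any value listed in stats::p.adjust.methods is an accepted method name
method <- "BH"
stopifnot(method %in% stats::p.adjust.methods)

# Adjusted values (q-values) under the Benjamini-Hochberg procedure
q_values <- stats::p.adjust(p_values, method = method)

# Adjusted values are never smaller than the raw p-values
stopifnot(all(q_values >= p_values))
```

Printing stats::p.adjust.methods in an R session lists every string this parameter accepts.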
Cutoffs are used to regulate how much graphical output is produced by CALANGO. The tab-separated value files that are generated at the end of the analysis (and saved in the output.dir) will always contain the complete, unfiltered results.
q-value cutoffs are used for correlation and phylogeny-aware linear models. Only entries with q-values smaller than these cutoffs will be shown.
Type: numeric between 0 and 1
Required: NO
Default: 1
correlation cutoffs are used to establish thresholds of positive/negative correlation values for the graphical output. Important: these parameters are a bit counter-intuitive. Please check the examples below for clarity.
Type: numeric values between -1 and 1
Description: Thresholds for Spearman correlation values. The selection criterion is: (Spearman correlation < lower.cutoff) OR (Spearman correlation > upper.cutoff)
Required: NO
Defaults: spearman.cor.upper.cutoff = -1; spearman.cor.lower.cutoff = 1 (i.e., no filtering)
Example 1: If you set spearman.cor.upper.cutoff = 0.8 and spearman.cor.lower.cutoff = -0.8, only pairs with Spearman correlation values smaller than -0.8 OR greater than 0.8 will be shown.
Example 2: If you set spearman.cor.upper.cutoff = 0 and spearman.cor.lower.cutoff = -1, pairs with Spearman correlation values smaller than -1 OR greater than 0 will be shown. Since the Spearman correlation cannot be smaller than -1, this means that only positively correlated pairs will be shown.
Example 3: If you set any values such that spearman.cor.upper.cutoff < spearman.cor.lower.cutoff, all pairs are shown (no filtering is performed).
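The three examples can be checked with a short sketch of the selection criterion. The helper keep_by_correlation and the correlation values are hypothetical and written only for this illustration; they are not part of CALANGO.

```r
# Hypothetical helper implementing the stated criterion:
# keep a pair if (correlation < lower.cutoff) OR (correlation > upper.cutoff)
keep_by_correlation <- function(cor_values, lower.cutoff, upper.cutoff) {
  (cor_values < lower.cutoff) | (cor_values > upper.cutoff)
}

# Made-up correlation values
cors <- c(-0.95, -0.5, 0, 0.3, 0.9)

# Example 1: only strong negative or strong positive correlations pass
ex1 <- keep_by_correlation(cors, lower.cutoff = -0.8, upper.cutoff = 0.8)
# TRUE FALSE FALSE FALSE TRUE

# Example 2: only positive correlations pass
ex2 <- keep_by_correlation(cors, lower.cutoff = -1, upper.cutoff = 0)
# FALSE FALSE FALSE TRUE TRUE

# Example 3 (the defaults): upper < lower, so every value passes
ex3 <- keep_by_correlation(cors, lower.cutoff = 1, upper.cutoff = -1)
# TRUE TRUE TRUE TRUE TRUE
```

The defaults disable filtering because any value that is not below lower.cutoff = 1 must equal 1, which is above upper.cutoff = -1.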
Type: numeric values between -1 and 1
Description: Thresholds for Pearson correlation values. The selection criterion is: (Pearson correlation < lower.cutoff) OR (Pearson correlation > upper.cutoff)
Required: NO
Defaults: pearson.cor.upper.cutoff = -1; pearson.cor.lower.cutoff = 1 (i.e., no filtering)
Type: numeric values between -1 and 1
Description: Thresholds for Kendall correlation values. The selection criterion is: (Kendall correlation < lower.cutoff) OR (Kendall correlation > upper.cutoff)
Required: NO
Defaults: kendall.cor.upper.cutoff = -1; kendall.cor.lower.cutoff = 1 (i.e., no filtering)
standard deviation and coefficient of variation cutoffs (only values greater than the cutoff will be shown)
Type: non-negative numeric value
Required: NO
Default: 0
sum of annotation terms cutoff (only values greater than the cutoff will be shown)
Type: non-negative integer/numeric value
Required: NO
Default: 0
prevalence and heterogeneity cutoffs (only values greater than the cutoff will be shown). Prevalence is defined as the percentage of lineages in which the annotation term was observed at least once. Heterogeneity is defined as the percentage of lineages in which the annotation term count differs from the median.
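The two definitions above can be made concrete with a small sketch. The counts are made up, and expressing the results on a 0 to 100 scale is an assumption taken from the word "percentage"; the scale CALANGO expects for the cutoff values is not stated here.

```r
# Made-up counts of one annotation term across six lineages
term_counts <- c(0, 2, 2, 5, 0, 2)

# Prevalence: share of lineages where the term was observed at least once
# (4 of the 6 lineages here)
prevalence <- 100 * mean(term_counts >= 1)

# Heterogeneity: share of lineages whose count differs from the median
# (the median is 2, and 3 of the 6 lineages differ from it)
heterogeneity <- 100 * mean(term_counts != median(term_counts))
```

A term present in every lineage with the same count would have a prevalence of 100 and a heterogeneity of 0 under these definitions.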
Type: character string. Accepts “TRUE” or “FALSE”
Description: If “TRUE”, all annotation terms whose raw values (before normalization) have a standard deviation of zero are removed. This filter removes the (quite common) bias that arises when QPAL (phenotype) and normalizing factors are strongly associated by chance.
Required: YES
Default: “TRUE”
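The effect of this filter can be sketched as follows, assuming a hypothetical matrix of raw counts with one row per genome and one column per annotation term. The data and object names are illustrative only and do not reflect CALANGO's internal data structures.

```r
# Hypothetical raw counts: rows are genomes, columns are annotation terms
raw_counts <- cbind(term_A = c(1, 1, 1),   # constant across genomes
                    term_B = c(0, 3, 7),
                    term_C = c(2, 2, 5))

# Standard deviation of each term's raw (pre-normalization) values
term_sd <- apply(raw_counts, 2, sd)

# Drop terms whose raw values do not vary; drop = FALSE keeps a matrix
filtered <- raw_counts[, term_sd > 0, drop = FALSE]

colnames(filtered)  # "term_B" "term_C"
```

A constant term such as term_A varies only through the normalizing factor once normalized, which is how a chance association between phenotype and normalizing factor can produce a spurious correlation.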