This function calculates and visualizes various overlap metrics for clones. The methods include the overlap coefficient (overlap), Morisita's overlap index (morisita), the Jaccard index (jaccard), cosine similarity (cosine), or the exact number of overlapping clones (raw).
clonalOverlap(
  input.data,
  cloneCall = "strict",
  method = NULL,
  chain = "both",
  group.by = NULL,
  order.by = NULL,
  exportTable = FALSE,
  palette = "inferno"
)
input.data
The product of combineTCR(), combineBCR(), or combineExpression()
cloneCall
How to call the clone - VDJC gene (gene), CDR3 nucleotide (nt), CDR3 amino acid (aa), VDJC gene + CDR3 nucleotide (strict) or a custom variable in the data
method
The method to calculate the "overlap", "morisita", "jaccard", "cosine" indices or "raw" for the base numbers
chain
Indicate if both or a specific chain should be used - e.g. "both", "TRA", "TRG", "IGH", "IGL"
group.by
The variable to use for grouping
order.by
A vector of specific plotting order or "alphanumeric" to plot groups in order
exportTable
Returns the data frame used for forming the graph
palette
Colors to use in visualization - input any hcl.pals
ggplot of the overlap of clones by group
The formulas for the indices are as follows:
Overlap Coefficient: $$overlap = \frac{\sum \min(a, b)}{\min(\sum a, \sum b)}$$
Raw Count Overlap: $$raw = \sum \min(a, b)$$
Morisita Index: $$morisita = \frac{\sum a b}{(\sum a)(\sum b)}$$
Jaccard Index: $$jaccard = \frac{\sum \min(a, b)}{\sum a + \sum b - \sum \min(a, b)}$$
Cosine Similarity: $$cosine = \frac{\sum a b}{\sqrt{(\sum a^2)(\sum b^2)}}$$
Where:
\(a\) and \(b\) are the abundances of species \(i\) (here, a clone) in groups A and B, respectively, and each sum runs over all species \(i\).
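The formulas above can be checked by hand on a small example. The following is a minimal base-R sketch that transcribes them literally for two made-up abundance vectors. The vectors `a` and `b`, the clone names, and the object `overlap_metrics` are hypothetical and exist only for illustration. The sketch does not claim to reproduce how clonalOverlap() computes these values internally.

```r
# Hypothetical clone abundances in groups A and B (illustrative values only)
a <- c(clone1 = 4, clone2 = 0, clone3 = 2, clone4 = 1)
b <- c(clone1 = 2, clone2 = 3, clone3 = 2, clone4 = 0)

# Sum of element-wise minima: the shared abundance between the two groups
shared <- sum(pmin(a, b))

# Each entry follows the corresponding formula as written above
overlap_metrics <- c(
  raw      = shared,
  overlap  = shared / min(sum(a), sum(b)),
  morisita = sum(a * b) / (sum(a) * sum(b)),
  jaccard  = shared / (sum(a) + sum(b) - shared),
  cosine   = sum(a * b) / sqrt(sum(a^2) * sum(b^2))
)

print(round(overlap_metrics, 3))
```

With these values, the shared abundance is 4 and both groups total 7, so the overlap coefficient is 4/7 and the Jaccard index is 4/10.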
# Making combined contig data
combined <- combineTCR(contig_list,
                       samples = c("P17B", "P17L", "P18B", "P18L",
                                   "P19B", "P19L", "P20B", "P20L"))
clonalOverlap(combined,
              cloneCall = "aa",
              method = "jaccard")
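To inspect the numbers behind the plot, the exportTable argument described above can be set to TRUE. The sketch below uses only the documented arguments and assumes that contig_list is the example data set bundled with the package and that it can be loaded with data(). The name `overlap_table` is illustrative.

```r
library(scRepertoire)

# Example contigs, combined the same way as in the example above
# (assumes contig_list ships with the package)
data("contig_list")
combined <- combineTCR(contig_list,
                       samples = c("P17B", "P17L", "P18B", "P18L",
                                   "P19B", "P19L", "P20B", "P20L"))

# Request the underlying table of raw overlap counts instead of the ggplot
overlap_table <- clonalOverlap(combined,
                               cloneCall = "aa",
                               method = "raw",
                               exportTable = TRUE)
head(overlap_table)
```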