vignettes/articles/Clonal_Diversity.Rmd
Clonal_Diversity.Rmd
Diversity can also be measured for samples or by other variables. Diversity metrics calculated, include: “shannon”, “inv.simpson”, “norm.entropy”, “gini.simpson”, “chao1”, and “ACE”. Please see the manual for more information on each metric and the underlying calculations.
Inherent in diversity calculations is a bias for increasing diversity
with increasing repertoire size. clonalDiversity()
will
automatically downsample to the smallest repertoire size and perform
bootstrapping to return the mean diversity estimates. If the output of
diversity values are strange or minimally variable, it is likely due to
a sample with small repertoire size.
n.boots
The number of calculations to perform (default =
100).
return.boots
skip.boots
Skip the bootstrapping calculations.
clonalDiversity(combined.TCR,
cloneCall = "gene")
There are 2 options for grouping in clonalDiversity()
and it is important to understand the difference.
group.by
x.axis
combined.TCR <- addVariable(combined.TCR,
variable.name = "Patient",
variables = c("P17", "P17", "P18", "P18",
"P19","P19", "P20", "P20"))
clonalDiversity(combined.TCR,
cloneCall = "gene",
group.by = "Patient")
clonalDiversity(combined.TCR,
cloneCall = "gene",
x.axis = "Patient")
As a default, clonalDiversity()
will return all the
metrics calculated - “shannon”,
“inv.simpson”, “norm.entropy”,
“gini.simpson”, “chao1”, and
“ACE”. Selecting a single or a subset of these methods
using the metrics parameter.
#Return only a subset of metrics
clonalDiversity(combined.TCR,
metrics = c("shannon", "ACE"),
cloneCall = "gene",
n.boots = 20)
We can also use Hill numbers to estimate the rarefaction, or estimating species richness, using the the abundance of clones across groupings. Underlying the rarefaction calculation is the use of observed receptor of abundance to compute diversity.
hill.numbers
plot.type
This function relies on the iNEXT with the accompanying manuscript. Like the other wrapping functions in scRepertoire, please cite the original work. The sample completeness curve (plot.type = 2), may not show full sample coverage due to the size/diversity of the input data.
clonalRarefaction(combined.TCR,
plot.type = 1,
hill.numbers = 0,
n.boots = 2)
clonalRarefaction(combined.TCR,
plot.type = 2,
hill.numbers = 0,
n.boots = 2)
clonalRarefaction(combined.TCR,
plot.type = 3,
hill.numbers = 0,
n.boots = 2)
clonalRarefaction(combined.TCR,
plot.type = 1,
hill.numbers = 1,
n.boots = 2)
clonalRarefaction(combined.TCR,
plot.type = 2,
hill.numbers = 1,
n.boots = 2)
clonalRarefaction(combined.TCR,
plot.type = 3,
hill.numbers = 1,
n.boots = 2)
Another method for modeling the repertoire distribution is a discrete gamma-GPD spliced threshold model, proposed by Koch et al. The spliced model models the repertoire and allows for the application of a power law distribution for larger clonal-expanded sequences and a Poisson distribution for smaller clones. After fitting the models, repertoires can be compared using Euclidean distance.
If using this function, please read/cite Koch et al. and check out the powerTCR R package.
clonalSizeDistribution(combined.TCR,
cloneCall = "aa",
method= "ward.D2")
If you are interested in measures of similarity between the samples
loaded into scRepertoire, using clonalOverlap()
can assist
in the visualization.
The underlying clonalOverlap()
calculation varies by the
method parameter, more information on the exact
calculations are available in the manual.
method
clonalOverlap(combined.TCR,
cloneCall = "strict",
method = "morisita")
clonalOverlap(combined.TCR,
cloneCall = "strict",
method = "raw")