addVariable: Adding Variables for Plotting

What if there are more variables to add than just sample and ID? We can add them with the addVariable() function. For each list element, the function adds a column (named by variable.name) filled with the value supplied for that element. The length of the variables parameter must match the number of elements in the combined object.

Key Parameter(s) for addVariable()

  • variable.name: A character string that defines the new variable to add (e.g., “Type”, “Treatment”).
  • variables: A character vector defining the desired column value for each list element. Its length must match the number of elements in the input.data list.

As an example, here we add a Type column, describing how each sample was processed and sequenced, to the combined.TCR object:

combined.TCR <- addVariable(combined.TCR, 
                            variable.name = "Type", 
                            variables = rep(c("B", "L"), 4))

head(combined.TCR[[1]])
##                    barcode sample                     TCR1           cdr3_aa1
## 1  P17B_AAACCTGAGTACGACG-1   P17B       TRAV25.TRAJ20.TRAC        CGCSNDYKLSF
## 3  P17B_AAACCTGCAACACGCC-1   P17B TRAV38-2/DV8.TRAJ52.TRAC CAYRSAQAGGTSYGKLTF
## 5  P17B_AAACCTGCAGGCGATA-1   P17B      TRAV12-1.TRAJ9.TRAC     CVVSDNTGGFKTIF
## 7  P17B_AAACCTGCATGAGCGA-1   P17B      TRAV12-1.TRAJ9.TRAC     CVVSDNTGGFKTIF
## 9  P17B_AAACGGGAGAGCCCAA-1   P17B        TRAV20.TRAJ8.TRAC      CAVRGEGFQKLVF
## 10 P17B_AAACGGGAGCGTTTAC-1   P17B      TRAV12-1.TRAJ9.TRAC     CVVSDNTGGFKTIF
##                                                  cdr3_nt1
## 1                       TGTGGGTGTTCTAACGACTACAAGCTCAGCTTT
## 3  TGTGCTTATAGGAGCGCGCAGGCTGGTGGTACTAGCTATGGAAAGCTGACATTT
## 5              TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT
## 7              TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT
## 9                 TGTGCTGTGCGAGGAGAAGGCTTTCAGAAACTTGTATTT
## 10             TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT
##                           TCR2          cdr3_aa2
## 1   TRBV5-1.None.TRBJ2-7.TRBC2    CASSLTDRTYEQYF
## 3  TRBV10-3.None.TRBJ2-2.TRBC2     CAISEQGKGELFF
## 5     TRBV9.None.TRBJ2-2.TRBC2 CASSVRRERANTGELFF
## 7     TRBV9.None.TRBJ2-2.TRBC2 CASSVRRERANTGELFF
## 9                         <NA>              <NA>
## 10    TRBV9.None.TRBJ2-2.TRBC2 CASSVRRERANTGELFF
##                                               cdr3_nt2
## 1           TGCGCCAGCAGCTTGACCGACAGGACCTACGAGCAGTACTTC
## 3              TGTGCCATCAGTGAACAGGGGAAAGGGGAGCTGTTTTTT
## 5  TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT
## 7  TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT
## 9                                                 <NA>
## 10 TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT
##                                                  CTgene
## 1         TRAV25.TRAJ20.TRAC_TRBV5-1.None.TRBJ2-7.TRBC2
## 3  TRAV38-2/DV8.TRAJ52.TRAC_TRBV10-3.None.TRBJ2-2.TRBC2
## 5          TRAV12-1.TRAJ9.TRAC_TRBV9.None.TRBJ2-2.TRBC2
## 7          TRAV12-1.TRAJ9.TRAC_TRBV9.None.TRBJ2-2.TRBC2
## 9                                  TRAV20.TRAJ8.TRAC_NA
## 10         TRAV12-1.TRAJ9.TRAC_TRBV9.None.TRBJ2-2.TRBC2
##                                                                                              CTnt
## 1                    TGTGGGTGTTCTAACGACTACAAGCTCAGCTTT_TGCGCCAGCAGCTTGACCGACAGGACCTACGAGCAGTACTTC
## 3  TGTGCTTATAGGAGCGCGCAGGCTGGTGGTACTAGCTATGGAAAGCTGACATTT_TGTGCCATCAGTGAACAGGGGAAAGGGGAGCTGTTTTTT
## 5  TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT_TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT
## 7  TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT_TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT
## 9                                                      TGTGCTGTGCGAGGAGAAGGCTTTCAGAAACTTGTATTT_NA
## 10 TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT_TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT
##                                CTaa
## 1        CGCSNDYKLSF_CASSLTDRTYEQYF
## 3  CAYRSAQAGGTSYGKLTF_CAISEQGKGELFF
## 5  CVVSDNTGGFKTIF_CASSVRRERANTGELFF
## 7  CVVSDNTGGFKTIF_CASSVRRERANTGELFF
## 9                  CAVRGEGFQKLVF_NA
## 10 CVVSDNTGGFKTIF_CASSVRRERANTGELFF
##                                                                                                                                               CTstrict
## 1                           TRAV25.TRAJ20.TRAC;TGTGGGTGTTCTAACGACTACAAGCTCAGCTTT_TRBV5-1.None.TRBJ2-7.TRBC2;TGCGCCAGCAGCTTGACCGACAGGACCTACGAGCAGTACTTC
## 3  TRAV38-2/DV8.TRAJ52.TRAC;TGTGCTTATAGGAGCGCGCAGGCTGGTGGTACTAGCTATGGAAAGCTGACATTT_TRBV10-3.None.TRBJ2-2.TRBC2;TGTGCCATCAGTGAACAGGGGAAAGGGGAGCTGTTTTTT
## 5          TRAV12-1.TRAJ9.TRAC;TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT_TRBV9.None.TRBJ2-2.TRBC2;TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT
## 7          TRAV12-1.TRAJ9.TRAC;TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT_TRBV9.None.TRBJ2-2.TRBC2;TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT
## 9                                                                                      TRAV20.TRAJ8.TRAC;TGTGCTGTGCGAGGAGAAGGCTTTCAGAAACTTGTATTT_NA;NA
## 10         TRAV12-1.TRAJ9.TRAC;TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT_TRBV9.None.TRBJ2-2.TRBC2;TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT
##    Type
## 1     B
## 3     B
## 5     B
## 7     B
## 9     B
## 10    B
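
The length rule is easier to see on a toy list. The following is a minimal base-R sketch of the behavior described above, not the package's implementation: add_column_per_element() is a hypothetical helper, and the toy data frames stand in for a combined object. In the real example, rep(c("B", "L"), 4) yields eight values, one per list element.

```r
# Toy stand-in for a combined object: a named list of data frames.
toy <- list(
  S1 = data.frame(barcode = c("S1_AAA", "S1_CCC"), sample = "S1",
                  stringsAsFactors = FALSE),
  S2 = data.frame(barcode = "S2_GGG", sample = "S2",
                  stringsAsFactors = FALSE),
  S3 = data.frame(barcode = c("S3_TTT", "S3_ACG"), sample = "S3",
                  stringsAsFactors = FALSE)
)

# Hypothetical helper (not part of scRepertoire): one value per list element,
# repeated down every row of that element's data frame.
add_column_per_element <- function(x, variable.name, variables) {
  # Mirror the documented requirement: lengths must match.
  stopifnot(length(variables) == length(x))
  for (i in seq_along(x)) {
    x[[i]][[variable.name]] <- rep(variables[i], nrow(x[[i]]))
  }
  x
}

toy <- add_column_per_element(toy, "Type", c("B", "L", "B"))
toy$S2
```

Passing a vector of the wrong length to this sketch stops with an error, which is the safest outcome when values and list elements could otherwise be silently misaligned.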

subsetClones: Filter Out Clonal Information

Likewise, we can remove specific list elements after combineTCR() or combineBCR() using the subsetClones() function. To subset, we specify the column header to subset on (name) and the specific values to include (variables).

Key Parameter(s) for subsetClones()

  • name: The column header/name in the metadata of input.data to use for subsetting (e.g., “sample”, “Type”).
  • variables: A character vector of the specific values within the chosen name column to retain in the subsetted data.

Below, we isolate just the two sequencing results from “P18L” and “P18B” samples:

subset1 <- subsetClones(combined.TCR, 
                        name = "sample", 
                        variables = c("P18L", "P18B"))

head(subset1[[1]][,1:4])
##                    barcode sample                 TCR1         cdr3_aa1
## 1  P18B_AAACCTGAGGCTCAGA-1   P18B TRAV26-1.TRAJ37.TRAC  CIVRGGSSNTGKLIF
## 3  P18B_AAACCTGCATGACATC-1   P18B    TRAV3.TRAJ20.TRAC    CAVQRSNDYKLSF
## 5  P18B_AAACCTGGTATGCTTG-1   P18B TRAV26-1.TRAJ53.TRAC   CIGSSGGSNYKLTF
## 8  P18B_AAACGGGCAGATGGGT-1   P18B                 <NA>             <NA>
## 9  P18B_AAACGGGTCTTACCGC-1   P18B    TRAV20.TRAJ9.TRAC CAVQAKRYTGGFKTIF
## 12 P18B_AAAGATGAGTTACGGG-1   P18B   TRAV8-3.TRAJ8.TRAC   CAVGGDTGFQKLVF

Alternatively, we can select the list elements directly by index after combineTCR() or combineBCR().

subset2 <- combined.TCR[c(3,4)]
head(subset2[[1]][,1:4])
##                    barcode sample                 TCR1         cdr3_aa1
## 1  P18B_AAACCTGAGGCTCAGA-1   P18B TRAV26-1.TRAJ37.TRAC  CIVRGGSSNTGKLIF
## 3  P18B_AAACCTGCATGACATC-1   P18B    TRAV3.TRAJ20.TRAC    CAVQRSNDYKLSF
## 5  P18B_AAACCTGGTATGCTTG-1   P18B TRAV26-1.TRAJ53.TRAC   CIGSSGGSNYKLTF
## 8  P18B_AAACGGGCAGATGGGT-1   P18B                 <NA>             <NA>
## 9  P18B_AAACGGGTCTTACCGC-1   P18B    TRAV20.TRAJ9.TRAC CAVQAKRYTGGFKTIF
## 12 P18B_AAAGATGAGTTACGGG-1   P18B   TRAV8-3.TRAJ8.TRAC   CAVGGDTGFQKLVF
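
Both approaches return the same elements in the same order: in the output above, the first element is P18B even though "P18L" was listed first in variables, so the result follows the order of the list rather than the order of the requested values. The base-R sketch below illustrates this on a toy list; it is an illustration of value-based versus positional selection, not the internals of subsetClones(), and the toy data frames are made up.

```r
# Toy stand-in for a combined object: a named list of data frames.
toy <- list(
  P18B = data.frame(barcode = c("P18B_AAA", "P18B_CCC"), sample = "P18B",
                    stringsAsFactors = FALSE),
  P18L = data.frame(barcode = "P18L_GGG", sample = "P18L",
                    stringsAsFactors = FALSE),
  P19B = data.frame(barcode = "P19B_TTT", sample = "P19B",
                    stringsAsFactors = FALSE)
)
keep <- c("P18L", "P18B")

# Value-based selection: keep elements whose "sample" column only holds
# requested values. Filter() preserves list order and names.
by_value <- Filter(function(df) all(df$sample %in% keep), toy)

# Positional selection of the same two elements.
by_index <- toy[c(1, 2)]

names(by_value)
identical(by_value, by_index)
```

Value-based selection is more robust when the order of list elements may change between analyses; positional selection is shorter when the order is known.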

exportClones: Save Clonal Data

After assigning clones by barcode, we can export the clonal information using exportClones() to save it for later use or to integrate it with other bioinformatics pipelines. This function supports several output formats tailored to different analytical needs.

Key Parameter(s) for exportClones()

  • format: The desired output format for the clonal data.
      ◦ airr: Exports data in an Adaptive Immune Receptor Repertoire (AIRR) Community-compliant format, with each row representing a single receptor chain.
      ◦ immunarch: Exports a list containing a data frame and metadata formatted for use with the immunarch package.
      ◦ paired: Exports a data frame with paired chain information (amino acid, nucleotide, genes) per barcode. This is the default.
      ◦ TCRMatch: Exports a data frame specifically for the TCRMatch algorithm, containing TRB chain amino acid sequence and clonal frequency.
      ◦ tcrpheno: Exports a data frame compatible with the tcrpheno pipeline, with TRA and TRB chains in separate columns.
  • write.file: If TRUE (default), saves the output to a CSV file. If FALSE, returns the data frame or list to the R environment.
  • dir: The directory where the output file will be saved. Defaults to the current working directory.
  • file.name: The name of the CSV file to be saved.

To export the combined clonotypes as a paired data frame and save it to a specified directory:

exportClones(combined, 
             write.file = TRUE,
             dir = "~/Documents/MyExperiment/Sample1/",
             file.name = "clones.csv")
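
Whether exportClones() creates a missing output directory is not stated here, so preparing the directory beforehand is a cautious habit. The sketch below uses only base R; out_dir is a hypothetical location under the session's temporary directory, and the export call is left as a comment because it needs the combined object.

```r
# Hypothetical output location under the session's temporary directory.
out_dir <- file.path(tempdir(), "MyExperiment", "Sample1")

# recursive = TRUE creates intermediate folders; showWarnings = FALSE keeps
# the call quiet when the directory already exists.
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
stopifnot(dir.exists(out_dir))

# The export would then point at the prepared directory, for example:
# exportClones(combined, write.file = TRUE, dir = out_dir, file.name = "clones.csv")
```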

To return the immunarch-formatted list (a data frame plus metadata) directly to your R environment without saving a file:

immunarch <- exportClones(combined.TCR, 
                          format = "immunarch", 
                          write.file = FALSE)
head(immunarch[[1]][[1]])
##   Clones   Proportion                                             CDR3.nt
## 1      1 0.0003565062    TGCGCCAGCAGTCGGGGACTAGCGGGATACAATGAGCAGTTCTTC;NA
## 2      1 0.0003565062       TGTGCCATCAGCGCGGACCCCCGCTACAATGAGCAGTTCTTC;NA
## 3      1 0.0003565062 TGTGCCAGCAGCTTGAGGGACAGCTATCGGTACTATGGCTACACCTTC;NA
## 4      2 0.0007130125          TGTGCCAGCAGCCGGCAGGGCGCAGATACGCAGTATTTT;NA
## 5      1 0.0003565062       TGTGCCAGCAGTCCCTTTACAGGGTTCTATGGCTACACCTTC;NA
## 6      1 0.0003565062          TGTGCCAGCTCATCCGGGATCAATCAGCCCCAGCATTTT;NA
##               CDR3.aa      V.name  D.name     J.name   C.name
## 1  CASSRGLAGYNEQFF;NA TRBV10-2;NA None;NA TRBJ2-1;NA TRBC2;NA
## 2   CAISADPRYNEQFF;NA TRBV10-3;NA None;NA TRBJ2-1;NA TRBC2;NA
## 3 CASSLRDSYRYYGYTF;NA TRBV11-3;NA None;NA TRBJ1-2;NA TRBC1;NA
## 4    CASSRQGADTQYF;NA TRBV11-3;NA None;NA TRBJ2-3;NA TRBC2;NA
## 5   CASSPFTGFYGYTF;NA TRBV12-4;NA None;NA TRBJ1-2;NA TRBC1;NA
## 6    CASSSGINQPQHF;NA   TRBV18;NA None;NA TRBJ1-5;NA TRBC1;NA
##                                           Barcode
## 1                         P17B_AGCGGTCCAAAGGAAG-1
## 2                         P17B_GGCTCGAGTCGCGGTT-1
## 3                         P17B_CGCGTTTTCGGCTACG-1
## 4 P17B_CACCAGGGTTCCTCCA-1;P17B_TCTATTGCAGGTGCCT-1
## 5                         P17B_GCTGGGTGTACGAAAT-1
## 6                         P17B_AGGGTGACATTGGTAC-1
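
In the output above, clones shared by several cells list their barcodes in a single ";"-delimited string (row 4 has Clones = 2 and two barcodes). If one row per barcode is needed downstream, the field can be expanded with base R. This is a sketch on two rows copied from the output, with the other columns omitted; it assumes the delimiter is always ";".

```r
# Two rows copied from the output above (other columns omitted).
clones <- data.frame(
  Clones  = c(1, 2),
  Barcode = c("P17B_AGCGGTCCAAAGGAAG-1",
              "P17B_CACCAGGGTTCCTCCA-1;P17B_TCTATTGCAGGTGCCT-1"),
  stringsAsFactors = FALSE
)

# Split each ";"-delimited string into a character vector of barcodes.
barcodes <- strsplit(clones$Barcode, ";", fixed = TRUE)

# Long format: one row per barcode, keeping the index of the source row.
long <- data.frame(
  clone_row = rep(seq_along(barcodes), lengths(barcodes)),
  barcode   = unlist(barcodes),
  stringsAsFactors = FALSE
)
long

# Sanity check: for these rows, Clones equals the number of barcodes listed.
all(lengths(barcodes) == clones$Clones)
```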

annotateInvariant

The annotateInvariant() function enables the identification of mucosal-associated invariant T (MAIT) cells and invariant natural killer T (iNKT) cells in single-cell sequencing datasets. These specialized T-cell subsets are defined by their characteristic TCR usage, making them distinguishable within single-cell immune profiling data. The function extracts TCR chain information from the provided single-cell dataset and evaluates it against known invariant TCR criteria for either MAIT or iNKT cells. Each cell is assigned a score indicating the presence (1) or absence (0) of the specified invariant T-cell population.

Key Parameter(s) for annotateInvariant()

  • type: Character string specifying the type of invariant T cell to annotate (MAIT or iNKT).
  • species: Character string specifying the species (mouse or human).

Below, we annotate MAIT cells and then iNKT cells, in both cases for human data:

combined <- annotateInvariant(combined, 
                              type = "MAIT", 
                              species = "human")

combined <- annotateInvariant(combined, 
                              type = "iNKT", 
                              species = "human")
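
Because each score is binary (1 for presence, 0 for absence), the results can be summarized with simple counts and proportions. The sketch below works on a made-up per-cell table; the column names MAIT.score and iNKT.score are assumptions for illustration and should be replaced with whatever columns annotateInvariant() adds to your object.

```r
# Toy per-cell table; the score column names are hypothetical.
meta <- data.frame(
  barcode    = paste0("cell", 1:6),
  MAIT.score = c(0, 1, 0, 0, 1, 0),
  iNKT.score = c(0, 0, 0, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Cross-tabulate the two binary scores.
table(MAIT = meta$MAIT.score, iNKT = meta$iNKT.score)

# Fraction of cells flagged for each population.
mait_fraction <- mean(meta$MAIT.score == 1)
inkt_fraction <- mean(meta$iNKT.score == 1)

# Collapse both scores into a single label for plotting or grouping.
meta$invariant <- ifelse(meta$MAIT.score == 1, "MAIT",
                         ifelse(meta$iNKT.score == 1, "iNKT", "none"))
meta
```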