vignettes/articles/Loading.Rmd
scRepertoire works with the filtered_contig_annotations.csv output from 10x Genomics Cell Ranger. This file is located in the ./outs/ directory of the VDJ alignment folder. To generate a list of contigs for scRepertoire, read each sample's file and combine the resulting data frames into a list:
S1 <- read.csv(".../Sample1/outs/filtered_contig_annotations.csv")
S2 <- read.csv(".../Sample2/outs/filtered_contig_annotations.csv")
S3 <- read.csv(".../Sample3/outs/filtered_contig_annotations.csv")
S4 <- read.csv(".../Sample4/outs/filtered_contig_annotations.csv")
contig_list <- list(S1, S2, S3, S4)
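With many samples, the same list can be built in a loop instead of one read.csv() call per sample. The following is a minimal sketch in base R, not part of scRepertoire: the helper name read_contig_list and its arguments base_dir and samples are hypothetical, and it assumes every sample folder follows the <sample>/outs/ layout described above. The demonstration writes toy files to a temporary directory so that it can run on its own.

```r
# Hypothetical helper (not an scRepertoire function): read one
# filtered_contig_annotations.csv per sample folder into a named list.
read_contig_list <- function(base_dir, samples) {
  paths <- file.path(base_dir, samples, "outs",
                     "filtered_contig_annotations.csv")
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("Missing contig files: ", paste(missing, collapse = ", "))
  }
  contigs <- lapply(paths, read.csv, stringsAsFactors = FALSE)
  names(contigs) <- samples
  contigs
}

# Self-contained demonstration using a temporary directory with toy files
demo_dir <- tempfile("vdj_demo_")
demo_samples <- c("Sample1", "Sample2")
for (s in demo_samples) {
  dir.create(file.path(demo_dir, s, "outs"), recursive = TRUE)
  write.csv(data.frame(barcode = "AAACCTGAGTACGACG-1", chain = "TRA"),
            file.path(demo_dir, s, "outs", "filtered_contig_annotations.csv"),
            row.names = FALSE)
}
demo_contig_list <- read_contig_list(demo_dir, demo_samples)
```

Naming the list elements by sample is a design choice of this sketch; it makes it easier to keep track of which data frame came from which run.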
Beyond the default 10x Genomics Cell Ranger pipeline outputs, scRepertoire supports other single-cell formats through loadContigs(), including the TRUST4 and WAT3R formats used in the examples below.

loadContigs() can be given a directory where the sequencing experiments are located, and it will recursively load and process the contig data based on the file names. Alternatively, loadContigs() can be given a list of data frames and will process the contig data directly.
#Directory example
contig.output <- "~/Documents/MyExperiment"
contig.list <- loadContigs(input = contig.output,
                           format = "TRUST4")
#List of data frames example
S1 <- read.csv("~/Documents/MyExperiment/Sample1/outs/barcode_results.csv")
S2 <- read.csv("~/Documents/MyExperiment/Sample2/outs/barcode_results.csv")
S3 <- read.csv("~/Documents/MyExperiment/Sample3/outs/barcode_results.csv")
S4 <- read.csv("~/Documents/MyExperiment/Sample4/outs/barcode_results.csv")
contig_list <- list(S1, S2, S3, S4)
contig.list <- loadContigs(input = contig_list,
                           format = "WAT3R")
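Before pointing loadContigs() at a directory, it can help to preview which files a recursive search would find. The sketch below uses only base R; the helper name find_contig_files is hypothetical, and the default pattern is borrowed from the barcode_results.csv file name in the WAT3R example above. Which file names loadContigs() actually matches for each format is not stated here, so treat the pattern as an assumption.

```r
# Hypothetical helper: list candidate contig files under a directory,
# searching all subfolders. The pattern is an assumption, not the rule
# that loadContigs() applies internally.
find_contig_files <- function(dir, pattern = "barcode_results\\.csv$") {
  list.files(dir, pattern = pattern, recursive = TRUE, full.names = TRUE)
}

# Self-contained demonstration with toy files in a temporary directory
search_dir <- tempfile("experiment_")
for (s in c("Sample1", "Sample2")) {
  dir.create(file.path(search_dir, s, "outs"), recursive = TRUE)
  writeLines("barcode", file.path(search_dir, s, "outs", "barcode_results.csv"))
}
# A file that does not match the pattern and should be ignored
writeLines("notes", file.path(search_dir, "Sample1", "outs", "README.txt"))

found_files <- find_contig_files(search_dir)
```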
To create the contig list from a multiplexed experiment, first generate a single-cell RNA object (either Seurat or SingleCellExperiment), load the filtered contig file, and then call createHTOContigList(). This function returns a list separated by the group.by variable(s).
This function depends on the barcodes matching between the single-cell object and the contigs. If a prefix or a different suffix has been added to the barcodes, no contigs will be recovered. It is currently recommended to run this step before integration, as integration workflows commonly alter the barcodes. There is a multi.run argument that can be used on an integrated object; however, it assumes you have modified the barcodes with the Seurat pipeline (automatic addition of _# to the end) and that your contig list is in the same order.
contigs <- read.csv(".../outs/filtered_contig_annotations.csv")
contig.list <- createHTOContigList(contigs,
                                   Seurat.Obj,
                                   group.by = "HTO_maxID")
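Because createHTOContigList() relies on matching barcodes, it can be worth checking the overlap before calling it. The following base R sketch is illustrative only: barcode_overlap is a hypothetical helper, the "S1_" prefix is an invented example of a modified barcode, and the cell names are a plain character vector standing in for the names you would pull from your own single-cell object (for a Seurat object, typically colnames(Seurat.Obj)).

```r
# Hypothetical helper: fraction of unique contig barcodes that are
# present among the cell names of the single-cell object.
barcode_overlap <- function(contig_barcodes, cell_names) {
  mean(unique(contig_barcodes) %in% cell_names)
}

contig_barcodes <- c("AAACCTGAGTACGACG-1", "AAACCTGCAACACGCC-1",
                     "AAACCTGCAGGCGATA-1")

# Stand-in for the cell names of a single-cell object whose barcodes
# picked up a prefix during processing (the prefix is an assumption)
cell_names <- paste0("S1_", contig_barcodes)

overlap_with_prefix <- barcode_overlap(contig_barcodes, cell_names)
overlap_after_strip <- barcode_overlap(contig_barcodes,
                                       sub("^S1_", "", cell_names))
```

An overlap of zero, as in the prefixed case here, is the situation in which no contigs would be recovered; stripping the prefix restores the match.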
scRepertoire comes with a data set of T cells derived from four patients with acute respiratory distress to demonstrate the functionality of the R package. More information on the data set can be found in the corresponding manuscript. The samples consist of paired peripheral blood (B) and bronchoalveolar lavage (L), effectively creating 8 distinct runs for T cell receptor (TCR) enrichment. We can preview the elements in the list by calling head() on the first set of contig annotations.
The built-in example data is derived from the 10x Cell Ranger pipeline, so it is ready to go for downstream processing and analysis.
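A preview like the one that follows can be produced along these lines. This is a sketch that assumes scRepertoire is installed and that the example data is exposed under the name contig_list; the requireNamespace() guard is only there so the snippet runs without error when the package is absent.

```r
# Assumption: the built-in example data is available as "contig_list"
# in the scRepertoire package.
has_scRepertoire <- requireNamespace("scRepertoire", quietly = TRUE)

if (has_scRepertoire) {
  data("contig_list", package = "scRepertoire")
  # First six contig annotations of the first run in the list
  print(head(contig_list[[1]]))
}
```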
## barcode is_cell contig_id high_confidence length
## 1 AAACCTGAGTACGACG-1 True AAACCTGAGTACGACG-1_contig_1 True 500
## 2 AAACCTGAGTACGACG-1 True AAACCTGAGTACGACG-1_contig_2 True 478
## 4 AAACCTGCAACACGCC-1 True AAACCTGCAACACGCC-1_contig_1 True 506
## 5 AAACCTGCAACACGCC-1 True AAACCTGCAACACGCC-1_contig_2 True 470
## 6 AAACCTGCAGGCGATA-1 True AAACCTGCAGGCGATA-1_contig_1 True 558
## 7 AAACCTGCAGGCGATA-1 True AAACCTGCAGGCGATA-1_contig_2 True 505
## chain v_gene d_gene j_gene c_gene full_length productive
## 1 TRA TRAV25 None TRAJ20 TRAC True True
## 2 TRB TRBV5-1 None TRBJ2-7 TRBC2 True True
## 4 TRA TRAV38-2/DV8 None TRAJ52 TRAC True True
## 5 TRB TRBV10-3 None TRBJ2-2 TRBC2 True True
## 6 TRA TRAV12-1 None TRAJ9 TRAC True True
## 7 TRB TRBV9 None TRBJ2-2 TRBC2 True True
## cdr3 cdr3_nt
## 1 CGCSNDYKLSF TGTGGGTGTTCTAACGACTACAAGCTCAGCTTT
## 2 CASSLTDRTYEQYF TGCGCCAGCAGCTTGACCGACAGGACCTACGAGCAGTACTTC
## 4 CAYRSAQAGGTSYGKLTF TGTGCTTATAGGAGCGCGCAGGCTGGTGGTACTAGCTATGGAAAGCTGACATTT
## 5 CAISEQGKGELFF TGTGCCATCAGTGAACAGGGGAAAGGGGAGCTGTTTTTT
## 6 CVVSDNTGGFKTIF TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT
## 7 CASSVRRERANTGELFF TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT
## reads umis raw_clonotype_id raw_consensus_id
## 1 8344 4 clonotype123 clonotype123_consensus_2
## 2 65390 38 clonotype123 clonotype123_consensus_1
## 4 18372 8 clonotype124 clonotype124_consensus_1
## 5 34054 9 clonotype124 clonotype124_consensus_2
## 6 5018 2 clonotype1 clonotype1_consensus_2
## 7 25110 11 clonotype1 clonotype1_consensus_1
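In the preview above, each barcode appears with one TRA and one TRB contig. A quick way to check this pairing per cell is to tabulate chains by barcode. The sketch below uses base R on a small data frame copied from the first four rows of the preview; the object names preview and chains_per_cell are illustrative, and the same table() call could be applied to a full contig data frame.

```r
# Subset of the preview rows shown above (barcode, chain, cdr3 columns)
preview <- data.frame(
  barcode = c("AAACCTGAGTACGACG-1", "AAACCTGAGTACGACG-1",
              "AAACCTGCAACACGCC-1", "AAACCTGCAACACGCC-1"),
  chain   = c("TRA", "TRB", "TRA", "TRB"),
  cdr3    = c("CGCSNDYKLSF", "CASSLTDRTYEQYF",
              "CAYRSAQAGGTSYGKLTF", "CAISEQGKGELFF"),
  stringsAsFactors = FALSE
)

# Rows are barcodes, columns are chains, entries are contig counts
chains_per_cell <- table(preview$barcode, preview$chain)
print(chains_per_cell)
```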