iRepertoire
Here we discuss how to process data obtained with the iRepertoire TCR LR kit. This is a multiplex protocol in which the forward primers are located in the FR1 region of the V gene and the reverse primers are complementary to the constant region.
Data libraries
This tutorial uses data from the following publication: Longitudinal High-Throughput Sequencing of the T-Cell Receptor Repertoire Reveals Dynamic Change and Prognostic Significance of Peripheral Blood TCR Diversity in Metastatic Colorectal Cancer During Chemotherapy. Yi-Tung Chen et al., Front. Immunol., 2022 Jan;12:743448. doi: 10.3389/fimmu.2021.743448
A total of 36 subjects, including 20 healthy controls and 16 metastatic CRC patients, were enrolled in this study. Peripheral blood samples were obtained from 20 age-matched healthy controls (62.6 ± 10.48 years old) and 16 CRC patients (62.38 ± 12.62 years old) before therapy. Among the 16 CRC patients, 67 peripheral blood samples were collected from 13 patients with follow-up every two months for approximately 98 to 452 days. This gives 103 samples in total. Peripheral blood mononuclear cells (PBMCs) were isolated following the standard procedure, and total RNA from PBMCs was extracted using TRIzol reagent (Invitrogen, Carlsbad, CA) according to the manufacturer’s protocol. A multiplex PCR amplification reaction was used to amplify the TCR immune repertoire. Human TCRα and TCRβ libraries were prepared using the HTAI-M and HTBI-M Kits (iRepertoire, Inc.) according to the manufacturer’s instructions, and 2 × 250 bp paired-end sequencing was performed on the Illumina MiSeq platform.
All data is available from SRA (PRJNA754274), e.g. via SRA Explorer.
Use aria2c to download the full dataset efficiently and with the proper file names:

```shell
mkdir -p raw
aria2c -c -s 16 -x 16 -k 1M -j 8 -i download-list.txt
```

Contents of download-list.txt:
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365468/SRR8365468_1.fastq.gz
out=raw/SRR8365468_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365468/SRR8365468_2.fastq.gz
out=raw/SRR8365468_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365457/SRR8365457_1.fastq.gz
out=raw/SRR8365457_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365457/SRR8365457_2.fastq.gz
out=raw/SRR8365457_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365458/SRR8365458_1.fastq.gz
out=raw/SRR8365458_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365458/SRR8365458_2.fastq.gz
out=raw/SRR8365458_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365459/SRR8365459_1.fastq.gz
out=raw/SRR8365459_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365459/SRR8365459_2.fastq.gz
out=raw/SRR8365459_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365463/SRR8365463_1.fastq.gz
out=raw/SRR8365463_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365463/SRR8365463_2.fastq.gz
out=raw/SRR8365463_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365469/SRR8365469_1.fastq.gz
out=raw/SRR8365469_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365469/SRR8365469_2.fastq.gz
out=raw/SRR8365469_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365465/SRR8365465_1.fastq.gz
out=raw/SRR8365465_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365465/SRR8365465_2.fastq.gz
out=raw/SRR8365465_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365467/SRR8365467_1.fastq.gz
out=raw/SRR8365467_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365467/SRR8365467_2.fastq.gz
out=raw/SRR8365467_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365464/SRR8365464_1.fastq.gz
out=raw/SRR8365464_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365464/SRR8365464_2.fastq.gz
out=raw/SRR8365464_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365450/SRR8365450_1.fastq.gz
out=raw/SRR8365450_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365450/SRR8365450_2.fastq.gz
out=raw/SRR8365450_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365461/SRR8365461_1.fastq.gz
out=raw/SRR8365461_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365461/SRR8365461_2.fastq.gz
out=raw/SRR8365461_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365462/SRR8365462_1.fastq.gz
out=raw/SRR8365462_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365462/SRR8365462_2.fastq.gz
out=raw/SRR8365462_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365456/SRR8365456_1.fastq.gz
out=raw/SRR8365456_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365456/SRR8365456_2.fastq.gz
out=raw/SRR8365456_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365475/SRR8365475_1.fastq.gz
out=raw/SRR8365475_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365475/SRR8365475_2.fastq.gz
out=raw/SRR8365475_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365460/SRR8365460_1.fastq.gz
out=raw/SRR8365460_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365460/SRR8365460_2.fastq.gz
out=raw/SRR8365460_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365470/SRR8365470_1.fastq.gz
out=raw/SRR8365470_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365470/SRR8365470_2.fastq.gz
out=raw/SRR8365470_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365471/SRR8365471_1.fastq.gz
out=raw/SRR8365471_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365471/SRR8365471_2.fastq.gz
out=raw/SRR8365471_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365473/SRR8365473_1.fastq.gz
out=raw/SRR8365473_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365473/SRR8365473_2.fastq.gz
out=raw/SRR8365473_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365449/SRR8365449_1.fastq.gz
out=raw/SRR8365449_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365449/SRR8365449_2.fastq.gz
out=raw/SRR8365449_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365482/SRR8365482_1.fastq.gz
out=raw/SRR8365482_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365482/SRR8365482_2.fastq.gz
out=raw/SRR8365482_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365446/SRR8365446_1.fastq.gz
out=raw/SRR8365446_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365446/SRR8365446_2.fastq.gz
out=raw/SRR8365446_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365483/SRR8365483_1.fastq.gz
out=raw/SRR8365483_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365483/SRR8365483_2.fastq.gz
out=raw/SRR8365483_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365447/SRR8365447_1.fastq.gz
out=raw/SRR8365447_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365447/SRR8365447_2.fastq.gz
out=raw/SRR8365447_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365484/SRR8365484_1.fastq.gz
out=raw/SRR8365484_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365484/SRR8365484_2.fastq.gz
out=raw/SRR8365484_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365448/SRR8365448_1.fastq.gz
out=raw/SRR8365448_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365448/SRR8365448_2.fastq.gz
out=raw/SRR8365448_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365424/SRR8365424_1.fastq.gz
out=raw/SRR8365424_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365424/SRR8365424_2.fastq.gz
out=raw/SRR8365424_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365485/SRR8365485_1.fastq.gz
out=raw/SRR8365485_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365485/SRR8365485_2.fastq.gz
out=raw/SRR8365485_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365488/SRR8365488_1.fastq.gz
out=raw/SRR8365488_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365488/SRR8365488_2.fastq.gz
out=raw/SRR8365488_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365421/SRR8365421_1.fastq.gz
out=raw/SRR8365421_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365421/SRR8365421_2.fastq.gz
out=raw/SRR8365421_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365489/SRR8365489_1.fastq.gz
out=raw/SRR8365489_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365489/SRR8365489_2.fastq.gz
out=raw/SRR8365489_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365490/SRR8365490_1.fastq.gz
out=raw/SRR8365490_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365490/SRR8365490_2.fastq.gz
out=raw/SRR8365490_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365246/SRR8365246_1.fastq.gz
out=raw/SRR8365246_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365246/SRR8365246_2.fastq.gz
out=raw/SRR8365246_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365474/SRR8365474_1.fastq.gz
out=raw/SRR8365474_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365474/SRR8365474_2.fastq.gz
out=raw/SRR8365474_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365422/SRR8365422_1.fastq.gz
out=raw/SRR8365422_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365422/SRR8365422_2.fastq.gz
out=raw/SRR8365422_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365423/SRR8365423_1.fastq.gz
out=raw/SRR8365423_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365423/SRR8365423_2.fastq.gz
out=raw/SRR8365423_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365420/SRR8365420_1.fastq.gz
out=raw/SRR8365420_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365420/SRR8365420_2.fastq.gz
out=raw/SRR8365420_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365419/SRR8365419_1.fastq.gz
out=raw/SRR8365419_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365419/SRR8365419_2.fastq.gz
out=raw/SRR8365419_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365248/SRR8365248_1.fastq.gz
out=raw/SRR8365248_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365248/SRR8365248_2.fastq.gz
out=raw/SRR8365248_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365249/SRR8365249_1.fastq.gz
out=raw/SRR8365249_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365249/SRR8365249_2.fastq.gz
out=raw/SRR8365249_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365247/SRR8365247_1.fastq.gz
out=raw/SRR8365247_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365247/SRR8365247_2.fastq.gz
out=raw/SRR8365247_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365250/SRR8365250_1.fastq.gz
out=raw/SRR8365250_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365250/SRR8365250_2.fastq.gz
out=raw/SRR8365250_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365348/SRR8365348_1.fastq.gz
out=raw/SRR8365348_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365348/SRR8365348_2.fastq.gz
out=raw/SRR8365348_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365251/SRR8365251_1.fastq.gz
out=raw/SRR8365251_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365251/SRR8365251_2.fastq.gz
out=raw/SRR8365251_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365418/SRR8365418_1.fastq.gz
out=raw/SRR8365418_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365418/SRR8365418_2.fastq.gz
out=raw/SRR8365418_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365310/SRR8365310_1.fastq.gz
out=raw/SRR8365310_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365310/SRR8365310_2.fastq.gz
out=raw/SRR8365310_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365252/SRR8365252_1.fastq.gz
out=raw/SRR8365252_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365252/SRR8365252_2.fastq.gz
out=raw/SRR8365252_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365308/SRR8365308_1.fastq.gz
out=raw/SRR8365308_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365308/SRR8365308_2.fastq.gz
out=raw/SRR8365308_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365309/SRR8365309_1.fastq.gz
out=raw/SRR8365309_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365309/SRR8365309_2.fastq.gz
out=raw/SRR8365309_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365253/SRR8365253_1.fastq.gz
out=raw/SRR8365253_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365253/SRR8365253_2.fastq.gz
out=raw/SRR8365253_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365307/SRR8365307_1.fastq.gz
out=raw/SRR8365307_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365307/SRR8365307_2.fastq.gz
out=raw/SRR8365307_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365259/SRR8365259_1.fastq.gz
out=raw/SRR8365259_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365259/SRR8365259_2.fastq.gz
out=raw/SRR8365259_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365258/SRR8365258_1.fastq.gz
out=raw/SRR8365258_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365258/SRR8365258_2.fastq.gz
out=raw/SRR8365258_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365306/SRR8365306_1.fastq.gz
out=raw/SRR8365306_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365306/SRR8365306_2.fastq.gz
out=raw/SRR8365306_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365305/SRR8365305_1.fastq.gz
out=raw/SRR8365305_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365305/SRR8365305_2.fastq.gz
out=raw/SRR8365305_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365304/SRR8365304_1.fastq.gz
out=raw/SRR8365304_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365304/SRR8365304_2.fastq.gz
out=raw/SRR8365304_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365260/SRR8365260_1.fastq.gz
out=raw/SRR8365260_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365260/SRR8365260_2.fastq.gz
out=raw/SRR8365260_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365303/SRR8365303_1.fastq.gz
out=raw/SRR8365303_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365303/SRR8365303_2.fastq.gz
out=raw/SRR8365303_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365261/SRR8365261_1.fastq.gz
out=raw/SRR8365261_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365261/SRR8365261_2.fastq.gz
out=raw/SRR8365261_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365262/SRR8365262_1.fastq.gz
out=raw/SRR8365262_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365262/SRR8365262_2.fastq.gz
out=raw/SRR8365262_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365302/SRR8365302_1.fastq.gz
out=raw/SRR8365302_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365302/SRR8365302_2.fastq.gz
out=raw/SRR8365302_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365301/SRR8365301_1.fastq.gz
out=raw/SRR8365301_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365301/SRR8365301_2.fastq.gz
out=raw/SRR8365301_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365263/SRR8365263_1.fastq.gz
out=raw/SRR8365263_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365263/SRR8365263_2.fastq.gz
out=raw/SRR8365263_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365264/SRR8365264_1.fastq.gz
out=raw/SRR8365264_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365264/SRR8365264_2.fastq.gz
out=raw/SRR8365264_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365267/SRR8365267_1.fastq.gz
out=raw/SRR8365267_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365267/SRR8365267_2.fastq.gz
out=raw/SRR8365267_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365274/SRR8365274_1.fastq.gz
out=raw/SRR8365274_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365274/SRR8365274_2.fastq.gz
out=raw/SRR8365274_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365269/SRR8365269_1.fastq.gz
out=raw/SRR8365269_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365269/SRR8365269_2.fastq.gz
out=raw/SRR8365269_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365271/SRR8365271_1.fastq.gz
out=raw/SRR8365271_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365271/SRR8365271_2.fastq.gz
out=raw/SRR8365271_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365273/SRR8365273_1.fastq.gz
out=raw/SRR8365273_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365273/SRR8365273_2.fastq.gz
out=raw/SRR8365273_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365265/SRR8365265_1.fastq.gz
out=raw/SRR8365265_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365265/SRR8365265_2.fastq.gz
out=raw/SRR8365265_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365270/SRR8365270_1.fastq.gz
out=raw/SRR8365270_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365270/SRR8365270_2.fastq.gz
out=raw/SRR8365270_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365272/SRR8365272_1.fastq.gz
out=raw/SRR8365272_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365272/SRR8365272_2.fastq.gz
out=raw/SRR8365272_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365266/SRR8365266_1.fastq.gz
out=raw/SRR8365266_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365266/SRR8365266_2.fastq.gz
out=raw/SRR8365266_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365268/SRR8365268_1.fastq.gz
out=raw/SRR8365268_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365268/SRR8365268_2.fastq.gz
out=raw/SRR8365268_HIP1_female_R2.fastq.gz
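After the download finishes, it can be worth confirming that every R1 file has its R2 mate before starting the analysis. The following is a minimal sketch, assuming the `raw/` directory and the `_R1.fastq.gz` / `_R2.fastq.gz` suffixes used in the list above; `check_pairs` is a hypothetical helper name and is not part of aria2c or MiXCR.

```shell
#!/usr/bin/env bash
# check_pairs DIR (hypothetical helper): report every *_R1.fastq.gz file in DIR
# that has no matching *_R2.fastq.gz, then print the number of complete pairs.
check_pairs() {
    local dir="$1"
    local r1 r2
    local pairs=0
    for r1 in "$dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue                 # the glob matched nothing
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"      # expected mate file name
        if [ -e "$r2" ]; then
            pairs=$((pairs + 1))
        else
            echo "missing mate: $r2"
        fi
    done
    echo "complete pairs: $pairs"
}

# Run the check only if the download directory exists.
if [ -d raw ]; then
    check_pairs raw
fi
```

For a complete download the last line should report as many pairs as there are entries in the download list.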
The project contains 103 FASTQ file pairs. For the purpose of this tutorial we assume that all FASTQ files are stored in the raw/ folder. The data was obtained using multiplex primers for the V genes and the constant region. The structure of the cDNA library is shown in the picture below.
Upstream analysis
MiXCR has a dedicated preset for this protocol, so analysing the data is as easy as:

```shell
mixcr analyze irepertoire-human-rna-xcr-repseq-sr \
    raw/CRC016_preTherapy_R1.fastq.gz \
    raw/CRC016_preTherapy_R2.fastq.gz \
    results/CRC016_preTherapy
```
One might also use GNU Parallel to process all samples at once:
```shell
#!/usr/bin/env bash

mkdir -p results

ls raw/*R1* | parallel -j 2 --line-buffer \
    "mixcr analyze irepertoire-human-rna-xcr-repseq-sr \
    {} \
    {=s:R1:R2:=} \
    {=s:.*/:results/:;s:_R.*::=}"
```
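The GNU Parallel command derives the R2 file name and the output prefix from each R1 path. The same mapping can be sketched in plain shell with parameter expansion. This is an illustration only: `sample_args` is a hypothetical helper, the `_R1` / `_R2` naming is assumed from the file names above, and the loop only prints the commands instead of running MiXCR.

```shell
#!/usr/bin/env bash
# sample_args R1_PATH (hypothetical helper): print "R1 R2 OUTPUT_PREFIX" for one sample.
sample_args() {
    local r1="$1"
    local r2="${r1%%_R1*}_R2${r1#*_R1}"    # swap the first "_R1" for "_R2"
    local name="${r1##*/}"                 # strip the directory part
    local prefix="results/${name%%_R*}"    # drop everything from the first "_R"
    printf '%s %s %s\n' "$r1" "$r2" "$prefix"
}

# Dry run: show the command that would be executed for every R1 file in raw/.
# The unquoted substitution is intentional so the three fields become three words
# (this assumes file names without spaces).
for r1 in raw/*_R1*; do
    [ -e "$r1" ] || continue
    echo mixcr analyze irepertoire-human-rna-xcr-repseq-sr $(sample_args "$r1")
done
```

Matching on `_R1` rather than on a bare `R1` avoids accidentally rewriting an `R1` that occurs earlier in a file name, for example inside a run accession.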
Under the hood pipeline:
Under the hood irepertoire-human-rna-xcr-repseq-sr executes the following pipeline:
align
Alignment of raw sequencing reads against the reference database of V, D, J and C gene segments.

```shell
mixcr align \
    --species hsa \
    -p default_4.0 \
    -OvParameters.geneFeatureToAlign="VTranscriptWithout5UTRWithP" \
    -OvParameters.parameters.floatingLeftBound=true \
    -OjParameters.parameters.floatingRightBound=false \
    -OcParameters.parameters.floatingRightBound=true \
    --report results/CRC016_preTherapy.report.txt \
    --json-report results/CRC016_preTherapy.report.json \
    raw/CRC016_preTherapy_R1.fastq.gz \
    raw/CRC016_preTherapy_R2.fastq.gz \
    results/CRC016_preTherapy.vdjca
```
Option --report is specified here explicitly.
--species hsa - determines the organism species.
-p generic-amplicon - a preset of MiXCR parameters for amplicon data.
-OvParameters.geneFeatureToAlign="VTranscriptWithout5UTRWithP" - sets the V gene feature to align. Check gene features for more info.
-OvParameters.parameters.floatingLeftBound=true - results in a local alignment algorithm for the V gene left bound, due to the presence of primer sequences in the V gene region.
-OjParameters.parameters.floatingRightBound=false -OcParameters.parameters.floatingRightBound=true - results in a global alignment algorithm for the J gene right bound and a local alignment algorithm for the C gene right bound, due to the presence of primer sequences.
assemble
Assembles alignments into clonotypes and applies several layers of error correction (e.g. quality-aware correction of sequencing errors, clustering to correct for PCR errors). Check mixcr assemble for more information. By default, clones are assembled by the CDR3 gene feature.
-OseparateByJ=true - split clones with the same CDR3 sequence and different J genes.

```shell
mixcr assemble \
    -OassemblingFeatures="CDR3" \
    -OseparateByJ=true \
    --report results/CRC016_preTherapy.report.txt \
    --json-report results/CRC016_preTherapy.report.json \
    results/CRC016_preTherapy.vdjca \
    results/CRC016_preTherapy.clns
```
export
Exports clonotypes from .clns file into human-readable tables.
```shell
mixcr exportClones \
    -c TRA \
    results/CRC016_preTherapy.clns \
    results/CRC016_preTherapy.clonotypes.TRA.tsv

mixcr exportClones \
    -c TRB \
    results/CRC016_preTherapy.clns \
    results/CRC016_preTherapy.clonotypes.TRB.tsv
```
-c <chain> - defines a specific chain to be exported.
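Since the export commands for the two TCR chains differ only in the chain name, they can be generated in a loop. The sketch below only prints the commands (remove the `echo` to run them) and assumes the `results/<sample>.clns` and `.clonotypes.<chain>.tsv` naming used in this tutorial; `export_cmds` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# export_cmds SAMPLE (hypothetical helper): print one exportClones command
# per TCR chain for the given sample name.
export_cmds() {
    local sample="$1"
    local chain
    for chain in TRA TRB; do
        echo mixcr exportClones -c "$chain" \
            "results/${sample}.clns" \
            "results/${sample}.clonotypes.${chain}.tsv"
    done
}

export_cmds CRC016_preTherapy
```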
After execution is complete the following list of files is generated for every sample:
```shell
# human-readable reports
CRC016_preTherapy.report.txt
CRC016_preTherapy.report.json
# raw alignments (highly compressed binary file)
CRC016_preTherapy.vdjca
# TRA, TRB CDR3 clonotypes (highly compressed binary file)
CRC016_preTherapy.clns
# TRA, TRB CDR3 clonotypes exported in tab-delimited txt
CRC016_preTherapy.clonotypes.TRA.tsv
CRC016_preTherapy.clonotypes.TRB.tsv
```
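A quick sanity check of an exported table is to count the clonotypes and sum their read counts. The following is a minimal sketch, assuming a tab-delimited file whose header row contains a `cloneCount` column; `clone_summary` is a hypothetical helper name and is not a MiXCR command.

```shell
#!/usr/bin/env bash
# clone_summary FILE (hypothetical helper): print the number of clonotypes
# and the total of the cloneCount column of a tab-delimited clonotype table.
clone_summary() {
    awk -F '\t' '
        NR == 1 {
            # locate the cloneCount column by its header name
            for (i = 1; i <= NF; i++) if ($i == "cloneCount") col = i
            if (!col) { print "cloneCount column not found" > "/dev/stderr"; exit 1 }
            next
        }
        { n++; total += $col }
        END { if (col) printf "clonotypes=%d reads=%d\n", n, total }
    ' "$1"
}
```

For example, `clone_summary results/CRC016_preTherapy.clonotypes.TRB.tsv` would summarise the TRB table produced by the export step.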
While the .clns file holds all data and is used for downstream analysis using mixcr postanalysis, the output .tsv clonotype table also contains exhaustive information about each clonotype:
See first 100 records from clonotype table CRC016_preTherapy:
| cloneId | cloneCount | cloneFraction | targetSequences | targetQualities | allVHitsWithScore | allDHitsWithScore | allJHitsWithScore | allCHitsWithScore | allVAlignments | allDAlignments | allJAlignments | allCAlignments | nSeqFR1 | minQualFR1 | nSeqCDR1 | minQualCDR1 | nSeqFR2 | minQualFR2 | nSeqCDR2 | minQualCDR2 | nSeqFR3 | minQualFR3 | nSeqCDR3 | minQualCDR3 | nSeqFR4 | minQualFR4 | aaSeqFR1 | aaSeqCDR1 | aaSeqFR2 | aaSeqCDR2 | aaSeqFR3 | aaSeqCDR3 | aaSeqFR4 | refPoints |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 32290 | 0.0900717 | TGTGCCAGCAGCACCTGGACAGGGAGTGGGGATGAGCAGTTCTTC | HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV7-9*00(1141.7) | TRBD1*00(41) | TRBJ2-1*00(209.2) | TRBC2*00(258.2) | 273 | 285 | 310 | 0 | 12 | 60.0 | 10 | 21 | 36 | 13 | 24 | SG12T | 41.0 | 28 | 42 | 70 | 31 | 45 | 70.0 | nan | nan | nan | nan | nan | ||
| 1 | 25443 | 0.0709723 | TGTGCCAGCAGCCCCTTTGAGGGACAGGGGCGCTTCGAGCAGTACTTC | HHHHHHHHHHHHHHHHHHHHHHHHHHHGGGGGGHGGHHHHHHHHHHHG | TRBV3-1*00(1156.4) | TRBD1*00(50) | TRBJ2-7*00(205.2) | TRBC2*00(258.1) | 270 | 283 | 307 | 0 | 13 | 65.0 | 12 | 22 | 36 | 20 | 30 | 50.0 | 23 | 39 | 67 | 32 | 48 | SA25T | 66.0 | nan | nan | nan | nan | nan | ||
| 2 | 9071 | 0.0253032 | TGTGCCAGCAGTTACTCGAACCCGGACAGTGTTGGGCCCTACGAGCAGTACTTC | HHHHHHHHHHHHHHHHHHGGGGGHHHHHGGGHGGGHGGGGGGHHHHHHHHHHHG | TRBV6-5*00(1207.6) | TRBD1*00(33) | TRBJ2-7*00(224.2) | TRBC2*00(258.1) | 270 | 289 | 307 | 0 | 19 | 95.0 | 9 | 19 | 36 | 20 | 29 | DG12 | 33.0 | 22 | 39 | 67 | 37 | 54 | 85.0 | nan | nan | nan | nan | nan | ||
| 3 | 7841 | 0.0218722 | TGTGCCTGGACAAAACCGGGCCAGGGTATCGCTGAAGCTTTCTTT | HHHHHHHHHHHGGGGGGHHHHHHHHGGHGGHHHHHHHHHHHHHHH | TRBV30*00(1146.3) | TRBD1*00(41) | TRBJ1-1*00(209.1) | TRBC1*00(257.8) | 270 | 280 | 304 | 0 | 10 | 50.0 | 10 | 21 | 36 | 15 | 26 | SA15C | 41.0 | 26 | 40 | 68 | 31 | 45 | 70.0 | nan | nan | nan | nan | nan | ||
| 4 | 6356 | 0.0177298 | TGCAGTGCCCGGGGGGGCCTCCATAGCAATCAGCCCCAGCATTTT | HHHHGGGGGGGGGGGGGGHHHHHHHHHHHHGGGGGHHHHHHHHHH | TRBV20-1*00(1182.6) | TRBD2*00(39) | TRBJ1-5*00(254.2) | TRBC1*00(258) | 279 | 287 | 313 | 0 | 8 | 40.0 | 24 | 38 | 48 | 9 | 22 | DA28SC32G | 39.0 | 19 | 42 | 70 | 22 | 45 | 115.0 | nan | nan | nan | nan | nan | ||
| 5 | 4823 | 0.0134536 | TGTGCCTGGAGACGGAGCACAGATACGCAGTATTTT | HHHHHHHHHHHHGHHHHHHHHHGGGGHHHHHHHHHH | TRBV30*00(1160.5) | nan | TRBJ2-3*00(244.2) | TRBC2*00(257.8) | 270 | 281 | 304 | 0 | 11 | 55.0 | nan | 20 | 41 | 69 | 15 | 36 | 105.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 6 | 4392 | 0.0122513 | TGTGCCTCGGGGACCTACGAGCAGTACTTC | HHGGHHHGHHHHHGGGGGHHHHHHHHHHHG | TRBV7-6*00(1117.5) | nan | TRBJ2-7*00(224.1) | TRBC2*00(258) | 273 | 279 | 310 | 0 | 6 | 30.0 | nan | 22 | 39 | 67 | 13 | 30 | 85.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 7 | 3785 | 0.0105581 | TGTGCCAGCAGCTTCTCGGATAGCAGAGAGACCCAGTACTTC | HHHHHHHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV5-4*00(1159.7) | nan | TRBJ2-5*00(220.1) | TRBC2*00(257.9) | 270 | 284 | 306 | 0 | 14 | 70.0 | nan | 13 | 40 | 68 | 13 | 42 | ST16CI19ASC21GI24G | 83.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 8 | 3443 | 0.00960412 | TGTGCCAGCAGCGTAACCGGGACTTACCCCCCAGATACGCAGTATTTT | HHHHHHHGGGGGGGGGGGGHHHHHGHGGGHHHHHGGGGHHHHHHHHHH | TRBV9*00(1160.3) | TRBD2*00(40),TRBD1*00(35) | TRBJ2-3*00(224.1) | TRBC2*00(258.1) | 270 | 285 | 306 | 0 | 15 | 75.0 | 14 | 22 | 48 | 16 | 24 | 40.0;10 | 17 | 36 | 16 | 23 | 35.0 | 24 | 41 | 69 | 31 | 48 | ||||
| 9 | 2785 | 0.00776865 | TGCGCCAGCAGCTTGGAACAGGGGGCGCGGACTGAAAAACTGTTTTTT | HHHGHHHHHHGHHHHHHHGGGGGGGGGGHHHHHHGHHHHHHGGHHHHH | TRBV5-1*00(1209.1) | TRBD1*00(55) | TRBJ1-4*00(219.1) | TRBC1*00(258.2) | 270 | 286 | 306 | 0 | 16 | 80.0 | 15 | 26 | 36 | 17 | 28 | 55.0 | 27 | 43 | 71 | 32 | 48 | 80.0 | nan | nan | nan | nan | nan | |||
| 10 | 2619 | 0.0073056 | TGCGCCAGCAATGAGTGGGGGGTCGGCACTGAAGCTTTCTTT | HHHGHHHHHHHHHHHHHGHGGGGGHHHHHHHHHHHHHHHHHH | TRBV10-2*00(1183.2) | TRBD1*00(25) | TRBJ1-1*00(219.1) | TRBC1*00(257.8) | 270 | 286 | 307 | 0 | 16 | SG280A | 66.0 | 18 | 23 | 36 | 16 | 21 | 25.0 | 24 | 40 | 68 | 26 | 42 | 80.0 | nan | nan | nan | nan | nan | ||
| 11 | 2355 | 0.00656918 | TGTGCCAGCTCACCGGGTCGTGGAACTGAAGCTTTCTTT | HHHHHHHHHGGGGGGGGGGHHHHHHHHHHHHHHHHHHHH | TRBV18*00(1172.8) | nan | TRBJ1-1*00(214.2) | TRBC1*00(258) | 273 | 287 | 310 | 0 | 14 | 70.0 | nan | 25 | 40 | 68 | 24 | 39 | 75.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 12 | 2312 | 0.00644924 | TGTGCCAGCTCTCAGCAAGCCCAGACCGGGGAGCTGTTTTTT | HHHHHHHHHHHHHHHGHHHHHHGGGGGGGHHHHHHGGHHHHH | TRBV28*00(1129.2) | nan | TRBJ2-2*00(229.1) | TRBC2*00(258.2) | 270 | 279 | 307 | 0 | 9 | 45.0 | nan | 25 | 43 | 71 | 24 | 42 | 90.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 13 | 1965 | 0.00548129 | TGTGCCAGCAGCGGACAGAGAACTATGAACACTGAAGCTTTCTTT | HHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV9*00(1151.4) | TRBD1*00(25) | TRBJ1-1*00(244.2) | TRBC1*00(257.4) | 270 | 283 | 306 | 0 | 13 | 65.0 | 14 | 19 | 36 | 13 | 18 | 25.0 | 19 | 40 | 68 | 24 | 45 | 105.0 | nan | nan | nan | nan | nan | |||
| 14 | 1902 | 0.00530556 | TGTGCCAGTTGGGGAGGCGAGCAGTACTTC | HHHHHHHGGHHGHGGGGGHHHHHHHGHHHG | TRBV6-5*00(1159.5) | TRBD2*00(30) | TRBJ2-7*00(204.1) | TRBC2*00(258.2) | 270 | 278 | 307 | 0 | 8 | 40.0 | 25 | 31 | 48 | 11 | 17 | 30.0 | 26 | 39 | 67 | 17 | 30 | 65.0 | nan | nan | nan | nan | nan | |||
| 15 | 1578 | 0.00440177 | TGTGCCAGCAGCCATCGGGACAGAAACTACGAGCAGTACTTC | HHHHHHHGGHHGGGGGHHHHHHHHHGGGGGHHHHHHHHHHHG | TRBV7-9*00(1137.3) | TRBD1*00(40) | TRBJ2-7*00(219.3) | TRBC2*00(258) | 273 | 285 | 310 | 0 | 12 | 60.0 | 11 | 19 | 36 | 15 | 23 | 40.0 | 23 | 39 | 67 | 26 | 42 | 80.0 | nan | nan | nan | nan | nan | |||
| 16 | 1003 | 0.00279783 | TGTGCCAGCAGCTTAGGGACAGATACGCAGTATTTT | HHHHHHHHHHHHHHHHHHHHHHGGGGHHHHHHHHHH | TRBV13*00(1188) | nan | TRBJ2-3*00(229.2) | TRBC2*00(258.2) | 270 | 287 | 307 | 0 | 17 | 85.0 | nan | 23 | 41 | 69 | 18 | 36 | 90.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 17 | 992 | 0.00276715 | TGTGCCTGGAGTGACAGGGTAGAGACCCAGTACTTC | HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHG | TRBV30*00(1168.5) | nan | TRBJ2-5*00(219.3) | TRBC2*00(258.1) | 270 | 287 | 304 | 0 | 16 | DT283 | 68.0 | nan | 24 | 40 | 68 | 20 | 36 | 80.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 18 | 962 | 0.00268346 | TGTGCCTGGATGGAGTCTGAAGCTTTCTTT | HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1154.9) | nan | TRBJ1-1*00(208.8) | TRBC1*00(257.7) | 270 | 280 | 304 | 0 | 10 | 50.0 | nan | 26 | 40 | 68 | 16 | 30 | 70.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 19 | 897 | 0.00250215 | TGTGCCAGCTCACCACAGGGGGGGACTGAAGCTTTCTTT | HHHGHHHHHHGHHHHGGGGGHHHHHHHHHHHHHHHHHHH | TRBV18*00(1179.8) | nan | TRBJ1-1*00(208.1) | TRBC1*00(258.1) | 273 | 302 | 310 | 0 | 28 | SC289AST292GST295GDG298 | 86.0 | nan | 25 | 40 | 68 | 24 | 39 | 75.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 20 | 888 | 0.00247704 | TGCGCCAGCAAGACACAGGGCTCGAAGGCTGAAGCTTTCTTT | HHHGHHHHHHHHHHHHHGGGHHHHHHHHHHHHHHHHHHHHHH | TRBV5-1*00(1181.4) | TRBD1*00(30) | TRBJ1-1*00(208.5) | TRBC1*00(257.6) | 270 | 280 | 306 | 0 | 10 | 50.0 | 15 | 21 | 36 | 14 | 20 | 30.0 | 26 | 40 | 68 | 28 | 42 | 70.0 | nan | nan | nan | nan | nan | |||
| 21 | 877 | 0.00244636 | TGTGCCTGGAGTGATCGGGTGGAGACCCAGTACTTC | HHHHHHHHHHHGGGGGGHHHHHHHHHHHHHHHHHHG | TRBV30*00(1169.1) | nan | TRBJ2-5*00(213.9) | TRBC2*00(258.1) | 270 | 283 | 304 | 0 | 13 | 65.0 | nan | 25 | 40 | 68 | 21 | 36 | 75.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 22 | 734 | 0.00204747 | TGCAGCGTTGTGGACAGGGGAGAGACCCAGTACTTC | HHHHHHGHHHHHHHHHGHHHHHHHHHHHHHHHHHHH | TRBV29-1*00(1210) | TRBD1*00(45) | TRBJ2-5*00(219.2) | TRBC2*00(257.8) | 276 | 286 | 310 | 0 | 10 | 50.0 | 13 | 22 | 36 | 11 | 20 | 45.0 | 24 | 40 | 68 | 20 | 36 | 80.0 | nan | nan | nan | nan | nan | |||
| 23 | 693 | 0.0019331 | TGTGCCAGTAGTATAAGCTCGAACGGGAACACTGAAGCTTTCTTT | HHHHHHHHHHHHHHHHGHHGHHHGHHHHHHHHHHHHHHHHHHHHH | TRBV19*00(1159.9) | nan | TRBJ1-1*00(233.8) | TRBC1*00(258) | 270 | 285 | 307 | 0 | 15 | 75.0 | nan | 21 | 40 | 68 | 26 | 45 | 95.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 24 | 692 | 0.00193031 | TGTGCCTGGAGTGTACTAGGGGGTAGTCAGCCCCAGCATTTT | HHHHHHHHHHHHHHHHGGGHHHHHHHHGGGGGHHHHHHHHHH | TRBV30*00(1187.8) | TRBD1*00(30) | TRBJ1-5*00(218.5) | TRBC1*00(258) | 270 | 286 | 304 | 0 | 16 | 80.0 | 17 | 23 | 36 | 17 | 23 | 30.0 | 26 | 42 | 70 | 26 | 42 | 80.0 | nan | nan | nan | nan | nan | |||
| 25 | 678 | 0.00189126 | TGTGCCTGGACCAAGGGACTAGCGGGGGTCAATGAGCAGTTCTTC | HHHHHHHHHHHHHHHHHHGGGGGGGHHHHHHHHHHHHHHHGGGGG | TRBV30*00(1154) | TRBD2*00(60) | TRBJ2-1*00(219.1) | TRBC2*00(256.8) | 270 | 280 | 304 | 0 | 10 | 50.0 | 16 | 28 | 48 | 14 | 26 | 60.0 | 26 | 42 | 70 | 29 | 45 | 80.0 | nan | nan | nan | nan | nan | |||
| 26 | 667 | 0.00186057 | TGTGCCTGGAGTGTGGGGGCCAGGCCATATAGCAATCAGCCCCAGCATTTT | HHHHHHHHHHGGGGGGHHHHGHHHHHHHHHHHHHHHGGGGGHHHHHHHHHH | TRBV30*00(1167.5) | TRBD1*00(30) | TRBJ1-5*00(259.1) | TRBC1*00(257.3) | 270 | 284 | 304 | 0 | 14 | 70.0 | 18 | 24 | 36 | 14 | 20 | 30.0 | 18 | 42 | 70 | 27 | 51 | 120.0 | nan | nan | nan | nan | nan | |||
| 27 | 649 | 0.00181036 | TGTGCCAGCAGCCAAGACGGGGGATCCAGCTCCTACGAGCAGTACTTC | HHHHHHHHGHHHHHHHHHGGHHHHHHHHHHHGGGGGHHHHHHHGGHGG | TRBV3-1*00(1172.3) | TRBD1*00(25) | TRBJ2-7*00(244.1) | TRBC2*00(257.8) | 270 | 287 | 307 | 0 | 17 | 85.0 | 18 | 23 | 36 | 18 | 23 | 25.0 | 18 | 39 | 67 | 27 | 48 | 105.0 | nan | nan | nan | nan | nan | |||
| 28 | 638 | 0.00177968 | TGCAGCGTCTGGGACAGGGAGGCCTACACCTTC | HHHHGGGHHHHHHHHHGHHHHHHHHHHHHHHHH | TRBV29-1*00(1214.7) | TRBD1*00(48) | TRBJ1-2*00(189) | TRBC1*00(258.1) | 276 | 284 | 310 | 0 | 8 | 40.0 | 12 | 24 | 36 | 10 | 23 | I21A | 48.0 | 30 | 40 | 68 | 23 | 33 | 50.0 | nan | nan | nan | nan | nan | ||
| 29 | 626 | 0.0017462 | TGTGCCTGGAGTGGGGGGGGCGCGGCGCGGGGCACTGAAGCTTTCTTT | HHHHHHHHHHGGGGGGGGGGGGGGGGGGGHHHHHHHHHHHHHHHHHHH | TRBV30*00(1170.2) | TRBD1*00(40) | TRBJ1-1*00(219) | TRBC1*00(257.7) | 270 | 283 | 304 | 0 | 13 | 65.0 | 18 | 26 | 36 | 15 | 23 | 40.0 | 24 | 40 | 68 | 32 | 48 | 80.0 | nan | nan | nan | nan | nan | |||
| 30 | 622 | 0.00173505 | TGCGCCAGCAGTATAGAAGACGCCCGTAATGAAAAACTGTTTTTT | HGGGHHHHHHHHHHHHGGGGGGGGGHHHHHHGHHHHHHGGHHHHH | TRBV5-1*00(1189.2) | TRBD1*00(25) | TRBJ1-4*00(234) | TRBC1*00(257.8) | 270 | 281 | 306 | 0 | 11 | 55.0 | 23 | 28 | 36 | 20 | 25 | 25.0 | 24 | 43 | 71 | 26 | 45 | 95.0 | nan | nan | nan | nan | nan | |||
| 31 | 599 | 0.00167089 | TGTGCCAGCAGCGAGATCGGGGGTCCGAGCTCCTACGAGCAGTACTTC | HHHHHHHGGGGHHHGGGGGGGGGGGGHHHHHGGGGGHHHHHHHGHHHG | TRBV5-4*00(1144.3) | TRBD1*00(25) | TRBJ2-7*00(249.4) | TRBC2*00(258.1) | 270 | 282 | 306 | 0 | 12 | 60.0 | 18 | 23 | 36 | 18 | 23 | 25.0 | 17 | 39 | 67 | 26 | 48 | 110.0 | nan | nan | nan | nan | nan | |||
| 32 | 594 | 0.00165694 | TGTGCCTGGACTGGTGGGGCTAGCACAGATACGCAGTATTTT | HHHHHHHHHHHHGGGGHHHHHHHHHHHHGGGGHHHHHHHHHH | TRBV30*00(1157) | nan | TRBJ2-3*00(259.3) | TRBC2*00(258.2) | 270 | 283 | 304 | 0 | 13 | SG280C | 51.0 | nan | 17 | 41 | 69 | 18 | 42 | 120.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 33 | 552 | 0.00153978 | TGTGCCTGGAGTGTCCAGGGCGTCTACGAGCAGTACTTC | HHHHHHHHHHHHHHHHGGGGGHGGGGGHHHHHHHGHHHH | TRBV30*00(1162.2) | TRBD1*00(25) | TRBJ2-7*00(219.2) | TRBC2*00(258.3) | 270 | 287 | 304 | 0 | 17 | SA284C | 71.0 | 20 | 25 | 36 | 17 | 22 | 25.0 | 23 | 39 | 67 | 23 | 39 | 80.0 | nan | nan | nan | nan | nan | ||
| 34 | 520 | 0.00145052 | TGTGCCAGCAGCCACCTTGAAACAACCCAAGAGACCCAGTACTTC | HHHHHHHGGHHGHHHHHHHHHGGGGHHHHHHHHHHHHHHHGHHHG | TRBV7-8*00(1138.4) | nan | TRBJ2-5*00(234.1) | TRBC2*00(257.6) | 273 | 285 | 310 | 0 | 12 | 60.0 | nan | 21 | 40 | 68 | 26 | 45 | 95.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 35 | 519 | 0.00144773 | TGTGCCTGGAGTGTGACAGTGATGAATCAGCCCCAGCATTTT | HHHHHHHHHHHHHHHHHHHHHHHHHHHGGGGGHHHHHHHHHH | TRBV30*00(1168.3) | nan | TRBJ1-5*00(229) | TRBC1*00(256) | 270 | 287 | 304 | 0 | 18 | I284G | 73.0 | nan | 24 | 42 | 70 | 24 | 42 | 90.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 36 | 517 | 0.00144215 | TGTGCCTGGAGTACGGGACCGGCGAACACTGAAGCTTTCTTT | HHHHHHHHHHGGHGHGGGGGGGGHHHHHHHHHHHHHHHHHHH | TRBV30*00(1163.4) | TRBD100(31),TRBD200(30) | TRBJ1-1*00(234.2) | TRBC1*00(258.6) | 270 | 282 | 304 | 0 | 12 | 60.0 | 11 | 20 | 36 | 13 | 22 | SA17C | 31.0;15 | 21 | 48 | 13 | 19 | 30.0 | 21 | 40 | 68 | 23 | 42 | |||
| 37 | 487 | 0.00135847 | TGTGCCTGGACGGTCCGACAGGCCCAGTACTTC | HHHHHHGGGGGGGGGGHHGGHHHHHHHHHHHHG | TRBV30*00(1150.9) | nan | TRBJ2-5*00(195.4) | TRBC2*00(258) | 270 | 280 | 304 | 0 | 10 | 50.0 | nan | 26 | 40 | 68 | 19 | 33 | SA28G | 56.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 38 | 474 | 0.00132221 | TGTGCCTGGAGTCCGGACGGGACGCAAGAGACCCAGTACTTC | HHHHHHHHHGGGGGGGHGGGGGGHHHHHHHHHHHHHHHHHHG | TRBV30*00(1160.9) | TRBD100(30),TRBD200(30) | TRBJ2-5*00(228.9) | TRBC2*00(258.1) | 270 | 282 | 304 | 0 | 12 | 60.0 | 11 | 17 | 36 | 17 | 23 | 30.0;15 | 21 | 48 | 17 | 23 | 30.0 | 22 | 40 | 68 | 24 | 42 | ||||
| 39 | 458 | 0.00127757 | TGTGCCTGGAAGGCGGATACGCAGTATTTT | HHHHHHHHHGGGGGHHGGGGHHHHHHHHHH | TRBV30*00(1155.4) | nan | TRBJ2-3*00(214.1) | TRBC2*00(257.3) | 270 | 280 | 304 | 0 | 10 | 50.0 | nan | 26 | 41 | 69 | 15 | 30 | 75.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 40 | 456 | 0.00127199 | TGTGCCAGCGTGAACAGATACCTCAACTCTGGGGCCAACGTCCTGACTTTC | HHHGGHHHHHHHHHHHHHHHHHHHHHHHGGGHHHHHGGGHHHHHHHHHHHG | TRBV6-5*00(1148) | nan | TRBJ2-6*00(264.1) | TRBC2*00(258) | 270 | 279 | 307 | 0 | 9 | 45.0 | nan | 20 | 45 | 73 | 26 | 51 | 125.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 41 | 450 | 0.00125526 | TGTGCCTGGGTCAAGATCCAGGGGGGTACTGAAGCTTTCTTT | HHHHGHHHHHHHHHHHHHGGGGHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1149.6) | TRBD1*00(35) | TRBJ1-1*00(214) | TRBC1*00(258.6) | 270 | 279 | 304 | 0 | 9 | 45.0 | 16 | 23 | 36 | 18 | 25 | 35.0 | 25 | 40 | 68 | 27 | 42 | 75.0 | nan | nan | nan | nan | nan | |||
| 42 | 411 | 0.00114647 | TGTGCCAGCAGCCAAGAGGTAAGGGGTCGGGGCCACAATCAGCCCCAGCATTTT | HHHHHHHHGHHHHHHHHHHHHGGGGGGGGHHHGHHHHHHGGGGGHHHHHHHHHH | TRBV4-2*00(1136.5) | TRBD100(26),TRBD200(25) | TRBJ1-5*00(234.5) | TRBC1*00(257) | 270 | 287 | 307 | 0 | 17 | 85.0 | 17 | 25 | 36 | 21 | 29 | SG22T | 26.0;29 | 34 | 48 | 29 | 34 | 25.0 | 23 | 42 | 70 | 35 | 54 | |||
| 43 | 407 | 0.00113531 | TGTGCCAGCAGTTACGCAGAATCCTACGAGCAGTACTTC | HHHHHHHHHHGGGGHHHHHHHHGGGGGHHHHHHHGHHHG | TRBV6-5*00(1190.3) | nan | TRBJ2-7*00(226.3) | TRBC2*00(257.7) | 270 | 285 | 307 | 0 | 15 | 75.0 | nan | 21 | 39 | 67 | 21 | 39 | 90.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 44 | 404 | 0.00112694 | TGCAGTGCTAGGAACCGGGACAGGAACACTGAAGCTTTCTTT | HHHHHHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV20-1*00(1199.4) | TRBD100(30),TRBD200(30) | TRBJ1-1*00(234.6) | TRBC1*00(256.9) | 279 | 290 | 313 | 0 | 11 | 55.0 | 10 | 16 | 36 | 14 | 20 | 30.0;14 | 20 | 48 | 14 | 20 | 30.0 | 18 | 40 | 68 | 20 | 42 | ST20G | |||
| 45 | 402 | 0.00112136 | TGTGCCTGGAGCCGCGGGACAGAAACAGATACGCAGTATTTT | HHHHHHHGGGGGGGGHHHHHHHHHHHHHGGGGHHHHHHHHHH | TRBV30*00(1156.6) | TRBD1*00(40) | TRBJ2-3*00(228.7) | TRBC2*00(258.1) | 270 | 281 | 304 | 0 | 11 | 55.0 | 11 | 19 | 36 | 14 | 22 | 40.0 | 23 | 41 | 69 | 24 | 42 | 90.0 | nan | nan | nan | nan | nan | |||
| 46 | 401 | 0.00111857 | TGTGCCAGCAGCTCAGGAGAGACCCAGTACTTC | HHHGHHHHHHHHHHHHHHHHHHHHHHHHHHHHG | TRBV7-6*00(1148.8) | nan | TRBJ2-5*00(219.4) | TRBC2*00(258.7) | 273 | 289 | 310 | 0 | 16 | ST286C | 66.0 | nan | 24 | 40 | 68 | 17 | 33 | 80.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 47 | 386 | 0.00107673 | TGTGCCAGCAGCTTGGGAGGGAACACTGAAGCTTTCTTT | HHHHHHHHHHGHHHGGHHHHHHHHHHHHHHHHHHHHHHH | TRBV5-8*00(1160.7) | nan | TRBJ1-1*00(233.6) | TRBC1*00(257.2) | 270 | 286 | 306 | 0 | 16 | 80.0 | nan | 21 | 40 | 68 | 20 | 39 | 95.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 48 | 382 | 0.00106557 | TGTGCCAGCAGCTCCGGGGGGGCTAAGGAGCAGTTCTTC | HHHHHHHHHHGGGGGGGGGHHHHHHHHHHHHHHHHHHHG | TRBV9*00(1143.1) | TRBD200(31),TRBD100(30) | TRBJ2-1*00(199.7) | TRBC2*00(256.5) | 270 | 282 | 306 | 0 | 12 | 60.0 | 24 | 33 | 48 | 14 | 23 | SA28G | 31.0;18 | 24 | 36 | 17 | 23 | 30.0 | 27 | 42 | 70 | 24 | 39 | ST29G | ||
| 49 | 376 | 0.00104884 | TGTGCCTGGGTGCAAAAAGACGCTAGCACAGATACGCAGTATTTT | HHHHGHGGHHHHGHHHGGGGGHHHHHHHHHHGGGGHHHHHHHHHG | TRBV30*00(1154) | nan | TRBJ2-3*00(259.2) | TRBC2*00(257.5) | 270 | 283 | 304 | 0 | 12 | DA279 | 48.0 | nan | 17 | 41 | 69 | 21 | 45 | 120.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 50 | 374 | 0.00104326 | TGTGCCAGCAGCTCGAAGCGTGGGACAAACTACGAGCAGTACTTC | HHHHHHHHHGGHHGGGGGGHHHHHHHHHGGGGGHHHHHHHGGGGG | TRBV11-2*00(1147.1) | TRBD1*00(30) | TRBJ2-7*00(219.3) | TRBC2*00(257.3) | 273 | 286 | 310 | 0 | 13 | 65.0 | 12 | 18 | 36 | 21 | 27 | 30.0 | 23 | 39 | 67 | 29 | 45 | 80.0 | nan | nan | nan | nan | nan | |||
| 51 | 368 | 0.00102652 | TGTGGCGCGGGCTCCTACAATGAGCAGTTCTTC | HGGGGGGGGGHHHHHHHHHHHHHHHHHHGGGGG | TRBV30*00(1121.8) | nan | TRBJ2-1*00(254.1) | TRBC2*00(258.3) | 270 | 274 | 304 | 0 | 4 | 20.0 | nan | 19 | 42 | 70 | 10 | 33 | 115.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 52 | 364 | 0.00101536 | TGTGCCAGCAGCCCTCGCCGGTCCATGAACACTGAAGCTTTCTTT | HHHHHHHGGHGGGGGGGGGHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV5-4*00(1149.6) | nan | TRBJ1-1*00(249.7) | TRBC1*00(255.8) | 270 | 282 | 306 | 0 | 12 | 60.0 | nan | 15 | 40 | 68 | 20 | 45 | ST17C | 111.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 53 | 352 | 0.000981891 | TGCAGCGTTAAAACAGCCGCGAAAAGGTACGAGCAGTACTTC | HGHHHHHHHHHHHHHGHHGGHHHHHGHGGHHHHHHHHGGGGG | TRBV29-1*00(1212.6) | nan | TRBJ2-7*00(214.2) | TRBC2*00(258.1) | 276 | 288 | 310 | 0 | 12 | SG285A | 46.0 | nan | 24 | 39 | 67 | 27 | 42 | 75.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 54 | 347 | 0.000967943 | TGTGCCAGCAGCTTAGTGATAGGGAACACCGGGGAGCTGTTTTTT | HHHGHHHHHHHHHHHHHHHHHHHHGGGGGGGGHHHHHHGGHHHHH | TRBV7-7*00(1144.4) | nan | TRBJ2-2*00(249.4) | TRBC2*00(257.7) | 273 | 289 | 310 | 0 | 16 | 80.0 | nan | 21 | 43 | 71 | 23 | 45 | 110.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 55 | 345 | 0.000962365 | TGTGCCTGGAGAAATGGAGGGGCTCCCGGTTTTGCCTACGAGCAGTACTTC | HHHHHHHHHHHHHHHGGHGGGGGGGGGGHHHHHHGGGGGHHHHHHHGGHGG | TRBV30*00(1161.3) | TRBD2*00(39) | TRBJ2-7*00(224.5) | TRBC2*00(257.7) | 270 | 281 | 304 | 0 | 11 | 55.0 | 26 | 40 | 48 | 15 | 28 | DC32SC33G | 39.0 | 22 | 39 | 67 | 34 | 51 | 85.0 | nan | nan | nan | nan | nan | ||
| 56 | 335 | 0.00093447 | TGTGCCAGCAGCGTAGATTGGGCCGTCGGCAATGAGCAGTTCTTC | HHHHHHHGGGGGHHHHHHGGGGGGGGGHHHHHHHHHHHHHGGHGG | TRBV9*00(1168) | TRBD2*00(26) | TRBJ2-1*00(219.4) | TRBC2*00(256.2) | 270 | 286 | 306 | 0 | 16 | 80.0 | 29 | 37 | 48 | 19 | 27 | SC34G | 26.0 | 26 | 42 | 70 | 29 | 45 | 80.0 | nan | nan | nan | nan | nan | ||
| 57 | 334 | 0.00093168 | TGTGCCAGCAGCGTAGGCTACAGCCAAGAGACCCAGTACTTC | HHHHHHHGGGGGHHHHHHHHGHHHHHHHHHHHHHHHHHHHHH | TRBV9*00(1178.4) | nan | TRBJ2-5*00(234.3) | TRBC2*00(258.4) | 270 | 290 | 306 | 0 | 21 | I285G | 88.0 | nan | 21 | 40 | 68 | 23 | 42 | 95.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 58 | 332 | 0.000926102 | TGCAGCGTTGACGAGGGGGCAGGACAGCCCCAGCATTTT | HHGHGGHGHGHGGGGGHHHHHHHHGGGGGHHHHHHHHHH | TRBV29-1*00(1233.3) | TRBD1*00(30) | TRBJ1-5*00(214.2) | TRBC1*00(257.8) | 276 | 290 | 310 | 0 | 14 | SA287C | 56.0 | 18 | 24 | 36 | 14 | 20 | 30.0 | 27 | 42 | 70 | 24 | 39 | 75.0 | nan | nan | nan | nan | nan | ||
| 59 | 329 | 0.000917733 | TGTGCCAGCAGCTTAGAGAGGGGGACAGCTAACTATGGCTACACCTTC | HHHGHHHHHHHHHHHHHGGGHHHHHHHHHHHHHHHHHHHHHHHGGGGG | TRBV7-6*00(1157.9) | TRBD1*00(30) | TRBJ1-2*00(249.2) | TRBC1*00(257.4) | 273 | 289 | 310 | 0 | 16 | 80.0 | 17 | 23 | 36 | 18 | 24 | 30.0 | 18 | 40 | 68 | 26 | 48 | 110.0 | nan | nan | nan | nan | nan | |||
| 60 | 327 | 0.000912154 | TGTGCCAGCAGCGGAAACAGGGGCCGCACAGATACGCAGTATTTT | HHHHHHHGGGGGHHHHHGHGGGGGGHHHHHHGGGGHHHHHHHHHH | TRBV5-4*00(1157.4) | TRBD1*00(35) | TRBJ2-3*00(239.2) | TRBC2*00(258.1) | 270 | 282 | 306 | 0 | 12 | 60.0 | 15 | 22 | 36 | 16 | 23 | 35.0 | 21 | 41 | 69 | 25 | 45 | 100.0 | nan | nan | nan | nan | nan | |||
| 61 | 320 | 0.000892628 | TGTGCCTGGAGCAAACAGGGCACTGAAGCTTTCTTT | HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1146.4) | TRBD1*00(30) | TRBJ1-1*00(219.3) | TRBC1*00(258.5) | 270 | 281 | 304 | 0 | 11 | 55.0 | 15 | 21 | 36 | 14 | 20 | 30.0 | 24 | 40 | 68 | 20 | 36 | 80.0 | nan | nan | nan | nan | nan | |||
| 62 | 319 | 0.000889839 | TGTGCCTGGAGTATGGCAGGGGCAGGAAACACCATATATTTT | HHHHHHHHHHHHHHHHGHGHHHHHHHHGHHHHHHHHHHHHHH | TRBV30*00(1165.9) | TRBD1*00(30) | TRBJ1-3*00(228.7) | TRBC1*00(256.9) | 270 | 282 | 304 | 0 | 12 | 60.0 | 16 | 22 | 36 | 16 | 22 | 30.0 | 24 | 42 | 70 | 24 | 42 | 90.0 | nan | nan | nan | nan | nan | |||
| 63 | 315 | 0.000878681 | TGTGCCTGGAGTGTACAGGGTTTCACCCTCCACTTT | HHHHHHHHHHHHHHHHHGHHHHHGGGGHHHHHHHHH | TRBV30*00(1183.1) | nan | TRBJ1-6*00(188) | TRBC1*00(255.8) | 270 | 287 | 304 | 0 | 17 | 85.0 | nan | 29 | 45 | 73 | 21 | 36 | DC36 | 63.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 64 | 315 | 0.000878681 | TGTGCCTGGAGTAGGGAACAGAACACCGGGGAGCTGTTTTTT | HHHHHHHHHHHHHHHHHHHHHGGGGGGGGHHHHHHGGHHHHH | TRBV30*00(1166.4) | nan | TRBJ2-2*00(248.7) | TRBC2*00(257.8) | 270 | 282 | 304 | 0 | 12 | 60.0 | nan | 21 | 43 | 71 | 20 | 42 | 110.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 65 | 313 | 0.000873102 | TGTGCCTGGAAGTATGCCCCATTGACCCGGGCGAGGAACACTGAAGCTTTCTTT | HHHHHHHHHHHHHHGGGHHHHHHGGGGGGGGGHHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1155.7) | TRBD100(35),TRBD200(30) | TRBJ1-1*00(234.1) | TRBC1*00(258.3) | 270 | 280 | 304 | 0 | 10 | 50.0 | 0 | 15 | 36 | 15 | 31 | I5ASC5TST8A | 35.0;13 | 19 | 48 | 25 | 31 | 30.0 | 21 | 40 | 68 | 35 | 54 | |||
| 66 | 302 | 0.000842418 | TGTGCCTGGAGTGACCGGACAGGGAAGGACACTGAAGCTTTCTTT | HHHHHHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1161.5) | TRBD1*00(40) | TRBJ1-1*00(224.2) | TRBC1*00(256.4) | 270 | 283 | 304 | 0 | 13 | 65.0 | 13 | 21 | 36 | 16 | 24 | 40.0 | 23 | 40 | 68 | 28 | 45 | 85.0 | nan | nan | nan | nan | nan | |||
| 67 | 296 | 0.000825681 | TGTGCCTTACAGAGGGCTAGGTACGAGCAGTACTTC | HHHHHHHHHHHHHHHHHHHGHGGGHHHHHHHGGGGG | TRBV30*00(1138.4) | TRBD100(31),TRBD200(30) | TRBJ2-7*00(214.3) | TRBC2*00(257.3) | 270 | 277 | 304 | 0 | 7 | 35.0 | 15 | 24 | 36 | 8 | 17 | SG19A | 31.0;27 | 33 | 48 | 11 | 17 | 30.0 | 24 | 39 | 67 | 21 | 36 | |||
| 68 | 294 | 0.000820102 | TGTGCCTGGAGTGCTGGAAACTATGGCTACACCTTC | HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHGHHH | TRBV30*00(1173.3) | nan | TRBJ1-2*00(215.1) | TRBC1*00(257.7) | 270 | 283 | 304 | 0 | 13 | 65.0 | nan | 22 | 40 | 68 | 18 | 36 | 90.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 69 | 293 | 0.000817313 | TGTGCCTGGACTGGCGGGAGCAGCACAGATACGCAGTATTTT | HHHHHHHHHHGGGGGGHHHHHHHHHHHHGGGGHHHHHHHHHH | TRBV30*00(1288.1) | TRBD2*00(35) | TRBJ2-3*00(243.7) | TRBC2*00(257.8) | 270 | 283 | 304 | 0 | 13 | SG280C | 51.0 | 23 | 30 | 48 | 13 | 20 | 35.0 | 20 | 41 | 69 | 21 | 42 | 105.0 | nan | nan | nan | nan | nan | ||
| 70 | 290 | 0.000808944 | TGCAGCGGGGCCGGGAGCAGGGATTACCAAGAGACCCAGTACTTC | HGGGGGGGGGGGGHHHHHHHHHHHHHHHHHHHHHHHHHHHGGGGG | TRBV29-1*00(1209.4) | TRBD1*00(43) | TRBJ2-5*00(244.1) | TRBC2*00(258.4) | 276 | 283 | 310 | 0 | 7 | 35.0 | 10 | 21 | 36 | 10 | 22 | I16G | 43.0 | 19 | 40 | 68 | 24 | 45 | 105.0 | nan | nan | nan | nan | nan | ||
| 71 | 282 | 0.000786628 | TGTGCCAGCAGCGTAGAGGCCTCCACAGATACGCAGTATTTT | HHHHHHHGGGGGHHHHHGGGHHHHHHHHGGGGHHHHHHHHHH | TRBV9*00(1167.2) | nan | TRBJ2-3*00(233.8) | TRBC2*00(258) | 270 | 286 | 306 | 0 | 16 | 80.0 | nan | 22 | 41 | 69 | 23 | 42 | 95.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 72 | 280 | 0.00078105 | TGCAGCGTTGAATCGCGATCTCCGGGACTTTTT | HHGHGGHHHHHGGGGHHHGGGGGHHHHGHHHHH | TRBV29-1*00(1231) | TRBD200(40),TRBD100(35) | TRBJ1-1*00(153.9) | TRBC1*00(257.3) | 276 | 288 | 310 | 0 | 12 | 60.0 | 14 | 22 | 48 | 21 | 29 | 40.0;10 | 17 | 36 | 21 | 28 | 35.0 | 37 | 40 | 68 | 30 | 33 | ||||
| 73 | 272 | 0.000758734 | TGTGCCAGCCAGGGGAGCCATGAGCAGTTCTTC | HHHHGGHHHGHGGGHHHHHHHHHHHHHHGGGGG | TRBV19*00(1125.3) | TRBD1*00(30) | TRBJ2-1*00(209.2) | TRBC2*00(256.8) | 270 | 278 | 307 | 0 | 8 | 40.0 | 16 | 22 | 36 | 9 | 15 | 30.0 | 28 | 42 | 70 | 19 | 33 | 70.0 | nan | nan | nan | nan | nan | |||
| 74 | 271 | 0.000755944 | TGCAGCGTTGGCCCCGACTTCAATCAGCCCCAGCATTTT | HGGGGGGHGGGGGGGHHHHHHHHHGGGGGHHHHHHHHHH | TRBV29-1*00(1219.3) | TRBD1*00(25) | TRBJ1-5*00(233.8) | TRBC1*00(257.2) | 276 | 286 | 310 | 0 | 10 | 50.0 | 0 | 5 | 36 | 10 | 15 | 25.0 | 23 | 42 | 70 | 20 | 39 | 95.0 | nan | nan | nan | nan | nan | |||
| 75 | 269 | 0.000750365 | TGTGCCTGGAGTGTGGACTCCCCGGGCAATCAGCCCCAGCATTTT | HHHHHHHHHHGHHHHHGGGGGGGHHHHHHHGGGGGHHHHHHHHHH | TRBV30*00(1182.6) | nan | TRBJ1-5*00(239.1) | TRBC1*00(256.5) | 270 | 291 | 304 | 0 | 21 | SA284GSC285G | 77.0 | nan | 22 | 42 | 70 | 25 | 45 | 100.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
| 76 | 269 | 0.000750365 | TGTGCCAGCAGCCGCGGGACTCTCCTGTTCTTC | HHHGHHHGGGGGGGGHHHHGHGHHHHHHHGGGG | TRBV7-6*00(1139) | TRBD200(35),TRBD100(30) | TRBJ2-1*00(174.4) | TRBC2*00(258.8) | 273 | 285 | 310 | 0 | 12 | 60.0 | 15 | 22 | 48 | 14 | 21 | 35.0;11 | 17 | 36 | 14 | 20 | 30.0 | 35 | 42 | 70 | 26 | 33 | ||||
| 77 | 267 | 0.000744786 | TGTGCCAGCAGTTACGGCGGGGGGCCCTCCTACGAGCAGTACTTC | HHHGHHHHHHGHGGHGGGGGGGGGGGHHGGGGGHHHHHHHGGGGG | TRBV6-5*00(1178.4) | TRBD2*00(38) | TRBJ2-7*00(234) | TRBC2*00(252.6) | 270 | 285 | 307 | 0 | 15 | 75.0 | 23 | 34 | 48 | 16 | 26 | DA28 | 38.0 | 20 | 39 | 67 | 26 | 45 | 95.0 | nan | nan | nan | nan | nan | ||
| 78 | 266 | 0.000741997 | TGTGCCTGGGCCACGGTACGGACGGGAGACACTGAAGCTTTCTTT | HHHHGHGHHGGHGGGGHHGGGHGHHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1154.1) | TRBD2*00(30) | TRBJ1-1*00(224) | TRBC1*00(256.9) | 270 | 279 | 304 | 0 | 9 | 45.0 | 24 | 30 | 48 | 22 | 28 | 30.0 | 23 | 40 | 68 | 28 | 45 | 85.0 | nan | nan | nan | nan | nan | |||
| 79 | 263 | 0.000733629 | TGTGCCGTGAAGGGTAGCGGGAGATACGAGCAGTACTTC | HGGGGHHHHHHHHGGGGGHHHHHHGGGHHHHHHHGGHGG | TRBV30*00(1150.1) | TRBD2*00(35) | TRBJ2-7*00(213.9) | TRBC2*00(258.1) | 270 | 285 | 304 | 0 | 16 | I276GSG278AST281G | 35.0 | 23 | 30 | 48 | 16 | 23 | 35.0 | 24 | 39 | 67 | 24 | 39 | 75.0 | nan | nan | nan | nan | nan | ||
| 80 | 261 | 0.00072805 | TGTGCCTGGAGTCTGCTAGCGGGAGGGAACAATGAGCAGTTCTTC | HHHHHHHHHHHHHHHGGGGGGGHHHHHHHHHHHHHHHHHHGGGGG | TRBV30*00(1161.7) | TRBD2*00(60) | TRBJ2-1*00(223.4) | TRBC2*00(258.3) | 270 | 282 | 304 | 0 | 12 | 60.0 | 20 | 32 | 48 | 15 | 27 | 60.0 | 25 | 42 | 70 | 28 | 45 | 85.0 | nan | nan | nan | nan | nan | |||
| 81 | 261 | 0.00072805 | TGTGCCTGGAGTCCCAGGATGAACACTGAAGCTTTCTTT | HHHHHHHHHHHHGHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1157.6) | nan | TRBJ1-1*00(244.3) | TRBC1*00(258.2) | 270 | 282 | 304 | 0 | 12 | 60.0 | nan | 19 | 40 | 68 | 18 | 39 | 105.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 82 | 259 | 0.000722471 | TGTGCCTGGAGCGGGACAGGGATCAATGAGCAGTTCTTC | HHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHHHGGHGG | TRBV30*00(1154.6) | TRBD1*00(50) | TRBJ2-1*00(218.5) | TRBC2*00(258.4) | 270 | 281 | 304 | 0 | 11 | 55.0 | 11 | 21 | 36 | 11 | 21 | 50.0 | 26 | 42 | 70 | 23 | 39 | 80.0 | nan | nan | nan | nan | nan | |||
| 83 | 256 | 0.000714102 | TGTGCCTGGAGTGGCGGACTGAGTAGCAATCAGCCCCAGCATTTT | HHHHHHHHHHGGGGGHHHHHHHHHHHHHHHGGGGGHHHHHHHHHH | TRBV30*00(1257.6) | TRBD2*00(25) | TRBJ1-5*00(248.2) | TRBC1*00(257.9) | 270 | 283 | 304 | 0 | 13 | 65.0 | 17 | 22 | 48 | 15 | 20 | 25.0 | 20 | 42 | 70 | 23 | 45 | 110.0 | nan | nan | nan | nan | nan | |||
| 84 | 255 | 0.000711313 | TGTGCCTGGAGTACCCTCGACAGGGCGAACTATGGCTACACCTTC | HHHHHHHHHHHHGGGGHGHHHGGGGGHHHHHHHHHHHHHHHHHHH | TRBV30*00(1165.5) | TRBD1*00(30) | TRBJ1-2*00(216.5) | TRBC1*00(258) | 270 | 282 | 304 | 0 | 12 | 60.0 | 14 | 20 | 36 | 18 | 24 | 30.0 | 19 | 40 | 68 | 24 | 45 | ST21G | 91.0 | nan | nan | nan | nan | nan | ||
| 85 | 254 | 0.000708523 | TGTGCCTGGAGGGACGCTTCTGGAAACACCATATATTTT | HHHHHHHHHHGGGGGHHHHHHHHHGHHHHHHHHHHHHHH | TRBV30*00(1158) | nan | TRBJ1-3*00(244.6) | TRBC1*00(258.5) | 270 | 281 | 304 | 0 | 11 | 55.0 | nan | 21 | 42 | 70 | 18 | 39 | 105.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 86 | 254 | 0.000708523 | TGTGCCTGGAGACGACAGGGGGCTAACTATGGCTACACCTTC | HHHHHHHHHHGGGHHGGGGHHHHHHHHHHHHHHHHHHGGHHH | TRBV30*00(1163.9) | TRBD1*00(40) | TRBJ1-2*00(243.9) | TRBC1*00(258.1) | 270 | 281 | 304 | 0 | 11 | 55.0 | 14 | 22 | 36 | 13 | 21 | 40.0 | 19 | 40 | 68 | 21 | 42 | 105.0 | nan | nan | nan | nan | nan | |||
| 87 | 251 | 0.000700155 | TGTGCCAGCAGCTTGGGGGGGGGTACGGAAGAGCAGTTCTTC | HHHHHHHHHHGHGGGGGGGHHGGHHGHHHHHHHHHHHHHHHG | TRBV5-6*00(1166.5) | TRBD1*00(25) | TRBJ2-1*00(199.1) | TRBC2*00(257.6) | 270 | 286 | 306 | 0 | 16 | 80.0 | 18 | 23 | 36 | 16 | 21 | 25.0 | 30 | 42 | 70 | 30 | 42 | 60.0 | nan | nan | nan | nan | nan | |||
| 88 | 250 | 0.000697366 | TGTGCCAGCAGTTACTCGAAGGTTTCAGACCCCGGACCTGGAAACACCATATATTTT | HHHGHHHHHHHHHHHHHHHHHHHHHHHGGGGGGHHHHHHHHHGHHHHHHHHHHHHHH | TRBV6-5*00(1206.9) | TRBD200(27),TRBD100(26) | TRBJ1-3*00(239) | TRBC1*00(257.1) | 270 | 289 | 307 | 0 | 19 | 95.0 | 10 | 21 | 48 | 26 | 37 | ST12ASG16C | 27.0;9 | 17 | 36 | 29 | 37 | SG12C | 26.0 | 22 | 42 | 70 | 37 | 57 | ||
| 89 | 249 | 0.000694576 | TGTGCCTGCCCAAGAGACAGGGTCTATGGCTACACCTTC | HHHGGHHHHHHHHHHHHHHHHHHHHHHHHHHHHHGGGGG | TRBV30*00(1142.4) | TRBD1*00(35) | TRBJ1-2*00(219.1) | TRBC1*00(257.5) | 270 | 278 | 304 | 0 | 8 | 40.0 | 14 | 21 | 36 | 15 | 22 | 35.0 | 24 | 40 | 68 | 23 | 39 | 80.0 | nan | nan | nan | nan | nan | |||
| 90 | 245 | 0.000683418 | TGTGCCAGCAGCTTATATTACAGGGTTGGGGCTTTCTTT | HHHHHHHHHHHHHHHHHHHHHHGGGGGGHHHHHHHHHHH | TRBV5-8*00(1131.2) | TRBD1*00(30) | TRBJ1-1*00(184.3) | TRBC1*00(257.8) | 270 | 284 | 306 | 0 | 14 | 70.0 | 15 | 21 | 36 | 19 | 25 | 30.0 | 31 | 40 | 68 | 30 | 39 | 45.0 | nan | nan | nan | nan | nan | |||
| 91 | 244 | 0.000680629 | TGTGCCTGGAGTGTATCGGCATCTGGAAACACCATATATTTT | HHHHHHHHHHHHGGGGGHHHHHHHHHHGHHHHHHHHHHHHHH | TRBV30*00(1173.4) | nan | TRBJ1-3*00(243.4) | TRBC1*00(257) | 270 | 285 | 304 | 0 | 15 | 75.0 | nan | 21 | 42 | 70 | 21 | 42 | 105.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 92 | 241 | 0.00067226 | TGTGCCAGTAGTCCGGGCATTTCCTACGAGCAGTACTTC | HHHHHHHHHGGGGGHHHHHHHHGGGGGHHHHHHHGGGGG | TRBV19*00(1143.2) | TRBD100(25),TRBD200(25) | TRBJ2-7*00(229.4) | TRBC2*00(257.9) | 270 | 282 | 307 | 0 | 12 | 60.0 | 10 | 15 | 36 | 12 | 17 | 25.0;14 | 19 | 48 | 12 | 17 | 25.0 | 21 | 39 | 67 | 21 | 39 | ||||
| 93 | 241 | 0.00067226 | TGCAGCGTGCTGGACAGGGGGTTGGACAATGAGCAGTTCTTC | HGGHGGHHHHHHHHGGGGGGHHHHHHHHHHHHHHHHHGGHGG | TRBV29-1*00(1214.6) | TRBD1*00(50) | TRBJ2-1*00(223.6) | TRBC2*00(258) | 276 | 284 | 310 | 0 | 8 | 40.0 | 13 | 23 | 36 | 11 | 21 | 50.0 | 25 | 42 | 70 | 25 | 42 | 85.0 | nan | nan | nan | nan | nan | |||
| 94 | 235 | 0.000655524 | TGCCTTCAGAGCCACATGAACACTGAAGCTTTCTTT | HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV20-1*00(1153.2) | nan | TRBJ1-1*00(249.5) | TRBC1*00(258.1) | 279 | 282 | 313 | 0 | 3 | 15.0 | nan | 18 | 40 | 68 | 14 | 36 | 110.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 95 | 234 | 0.000652734 | TGCAGCGTTGAAGCTTCGGCAGGGCAAAAAGCTTTCTTT | HHHHHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHHHH | TRBV29-1*00(1223.9) | TRBD2*00(31) | TRBJ1-1*00(194.1) | TRBC1*00(257.7) | 276 | 289 | 310 | 0 | 13 | 65.0 | 24 | 33 | 48 | 16 | 25 | SG27C | 31.0 | 29 | 40 | 68 | 28 | 39 | 55.0 | nan | nan | nan | nan | nan | ||
| 96 | 228 | 0.000635997 | TGTGCCATCAGTGAGTTGGGGGGGCTCGTTTTAAAGGCTTCCGAAGCTTTCTTT | HHHHHHHHHHHHHHGGGGGGGGGGGGGHHHHHHHHHHGGGGGHHHHHHHHHHHH | TRBV10-3*00(1143.7) | TRBD100(30),TRBD200(26) | TRBJ1-1*00(198.7) | TRBC1*00(256.8) | 270 | 286 | 307 | 0 | 16 | 80.0 | 18 | 24 | 36 | 19 | 25 | 30.0;25 | 33 | 48 | 17 | 25 | SA28G | 26.0 | 28 | 40 | 68 | 42 | 54 | |||
| 97 | 227 | 0.000633208 | TGCAGCGTTGAGGCGGGACGCCGTTACAATGAGCAGTTCTTC | HHHHHHHHHGHHHGGGGGGGGGHHHHHHHHHHHHHHHHHHHG | TRBV29-1*00(1219.9) | TRBD100(30),TRBD200(30) | TRBJ2-1*00(229.1) | TRBC2*00(258.2) | 276 | 287 | 310 | 0 | 11 | 55.0 | 11 | 17 | 36 | 13 | 19 | 30.0;15 | 21 | 48 | 13 | 19 | 30.0 | 24 | 42 | 70 | 24 | 42 | ||||
| 98 | 226 | 0.000630419 | TGTGCCAGCAGCGATGACAGCTCCTACGAGCAGTACTTC | HHHHHHHHGHGHHHHHHHHHHHGGGGGHHHHHHHGGGGG | TRBV14*00(1146.8) | nan | TRBJ2-7*00(244) | TRBC2*00(258.3) | 273 | 285 | 310 | 0 | 12 | 60.0 | nan | 18 | 39 | 67 | 18 | 39 | 105.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
| 99 | 225 | 0.000627629 | TGTGCCTGGAGCGGAAGCACTGAAGCTTTCTTT | HHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1162.7) | nan | TRBJ1-1*00(222.1) | TRBC1*00(254.9) | 270 | 281 | 304 | 0 | 11 | 55.0 | nan | 21 | 40 | 68 | 13 | 33 | I24G | 83.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan |
Quality control
Now that all files are processed, let's perform quality control. The first thing to check is the alignment rate, which can be done with the `mixcr exportQc align` function.
```shell
mixcr exportQc align results/*.clns figs/alignQc.pdf
```
From this plot we can clearly see some issues with the libraries. Many samples have a relatively large fraction of unaligned reads, primarily due to the absence of J hits.
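To find out which samples contribute most to this, the relevant numbers can be pulled from the per-sample align reports. The sketch below assumes each sample has a plain-text align report with lines such as `Successfully aligned reads: 386658 (94.52%)`. The helper names (`report_pct`, `summarize_align_reports`) and the `*.align.report.txt` file naming are illustrative assumptions, not part of MiXCR.

```shell
# Print the percentage shown in parentheses on a given report line.
# $1 = label exactly as it appears before the colon, $2 = report file.
report_pct() {
    awk -v label="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)                 # tolerate indented lines
        }
        index(line, label ":") == 1 && match(line, /\([0-9.]+%\)[ \t]*$/) {
            pct = substr(line, RSTART + 1, RLENGTH)  # "94.52%)" plus trailing blanks
            sub(/%.*/, "", pct)                      # keep only the number
            print pct
            exit
        }' "$2"
}

# Tab-separated summary: sample, % aligned, % lost to missing J hits.
# The *.align.report.txt naming is an assumption; adjust it to your layout.
summarize_align_reports() {
    for report in "$@"; do
        sample=$(basename "$report" .align.report.txt)
        printf '%s\t%s\t%s\n' "$sample" \
            "$(report_pct 'Successfully aligned reads' "$report")" \
            "$(report_pct 'Alignment failed because of absence of J hits' "$report")"
    done
}
```

For example, `summarize_align_reports results/*.align.report.txt | sort -k3,3nr | head` would list the samples that lose the most reads to missing J hits first.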
MiXCR lets us investigate further. Let's pick one of the samples where the issue is most obvious (e.g. CRC003_therapy-08). To look at the read alignments for that sample, we first run `mixcr align` for it once again, this time with two additional options, `-OallowPartialAlignments=true` and `-OallowNoCDR3PartAlignments=true`, which preserve partially aligned reads (e.g. reads that lack a J gene) and reads that lack a CDR3 sequence.
```shell
mkdir -p debug

mixcr align \
    -s hsa \
    -p generic-amplicon \
    -OvParameters.geneFeatureToAlign="VTranscriptWithout5UTRWithP" \
    -OvParameters.parameters.floatingLeftBound=true \
    -OjParameters.parameters.floatingRightBound=false \
    -OcParameters.parameters.floatingRightBound=true \
    -OallowPartialAlignments=true \
    -OallowNoCDR3PartAlignments=true \
    raw/CRC003_therapy-08_R1.fastq.gz raw/CRC003_therapy-08_R2.fastq.gz \
    debug/CRC003_therapy-08_debug.vdjca
```
Now we can look at the raw alignments themselves using `mixcr exportAlignmentsPretty`.
The command below generates a human-readable .txt file with alignments. We use `--skip 1000` to skip the first 1000 reads, as the first reads usually have poor quality, and `--limit 100` to export only 100 alignments, since we usually don't need to examine every alignment to see the issue.
```shell
mixcr exportAlignmentsPretty \
    --skip 1000 \
    --limit 100 \
    debug/CRC003_therapy-08_debug.vdjca \
    debug/CRC003_therapy-08_debug.alignments.txt
```
Below you can see a few alignments from the generated file. The first one is an example of a well-aligned read.
```shell
Read ids: 1113

FR1><CDR1 CDR1><FR2
_ E K P V T L S C S Q T L N H N V M Y W Y Q Q K S S Q
Quality 66666767777777777777777777777777777777777777777777777777777777777777777777777777
Target0 0 AGAAAAGCCAGTGACCCTGAGTTGTTCTCAGACTTTGAACCATAACGTCATGTACTGGTACCAGCAGAAGTCAAGTCAGG 79 Score
TRBV15*00 104 aaagccagtgaccctgagttgttctcagactttgaaccataacgtcatgtactggtaccagcagaagtcaagtcagg 180 1148

FR2><CDR2 CDR2><FR3
A P K L L F H Y Y D K D F N N E A D T P D N F Q S R R
Quality 77777777777777777777777777777777777777777777777777777777777777777777777777777777
Target0 80 CCCCAAAGCTGCTGTTCCACTACTATGACAAAGATTTTAACAATGAAGCAGACACCCCTGATAACTTCCAATCCAGGAGG 159 Score
TRBV15*00 181 ccccaaagctgctgttccactactatgacaaagattttaacaatgaagcagacacccctgataacttccaatccaggagg 260 1148

FR3><CDR3 V
P N I S F C F L D I R S P G L G D A A M Y L C A T S G
Quality 77777777777777777777777777777777777777777777777777777777777777777777777777777777
Target0 160 CCGAACATTTCTTTCTGCTTTCTTGACATCCGCTCACCAGGCCTGGGGGACGCAGCCATGTACCTGTGTGCCACCAGCGG 239 Score
TRBV15*00 261 ccgaacaCttctttctgctttcttgacatccgctcaccaggcctgggggacAcagccatgtacctgtgtgccaccagcAg 340 1148

> <J CDR3><FR4 FR4><C
L G D T Q Y F G P G T R L T V L E D L K N V F P P E
Quality 77777777777777777777777777777777777777777777777777777777777777777777777777777777
Target0 240 ACTAGGGGATACGCAGTATTTTGGCCCAGGCACCCGGCTGACAGTGCTCGAGGACCTGAAAAACGTGTTCCCACCCGAGG 319 Score
TRBV15*00 341 a 341 1148
TRBJ2-3*00 26 gatacgcagtattttggcccaggcacccggctgacagtgctcg 68 215
TRBC2*00 0 aggacctgaaaaacgtgttcccacccgagg 29 260

V A V F E P S D S Q _
Quality 7777777777777777777777777766666
Target0 320 TCGCTGTGTTTGAGCCATCAGATAGTCAATG 350 Score
TRBC2*00 30 tcgctgtgtttgagccatcaga 51 260
```
Now, the following pair of reads failed to align.
```shell
Read ids: 1115

_ N P V G L R C Y P T S V F F C V Y L Y Q Q K P F P C
Quality 33553553353536366363363333633322336733363663633333633633733363222632222333655322
Target0 0 CAAACCCCGTGGGGCTGAGGTGCTACCCAACCTCTGTCTTTTTCTGTGTGTACTTGTACCAACAAAAACCCTTCCCCTGC 79 Score

P G S P S K N Y Q A E G G G D G E G E E G V S G G R R
Quality 62222252552522333333633336333225522622222222252522525222222222222233522525222522
Target0 80 CCCGGGTCCCCCAGTAAGAATTATCAGGCCGAAGGGGGAGGAGACGGGGAGGGGGAAGAAGGAGTATCCGGGGGGCGGCG 159 Score

FR3><CDR3
V V L K K K L N L S S L E L V D S A L Y F C A S V G
Quality 22242225222672224452252522242225242626256626244262222424525222625452225222252225
Target0 160 CGTTGTCTTAAAAAAAAAACTAAACCTGAGCTCTCTGGAGCTGGTGGACTCAGCTTTGTATTTCTGTGCCAGCGTCGGGT 239 Score
TRBV9*00 280 aactaaacctgagctctctggagctggGggactcagctttgtatttctgtgccagc--AgCgt 340 200

V><VP VP>
S H H I
Quality 242 22 252224
Target0 240 CGC-AC-CACATA 250 Score
TRBV9*00 341 AgcTacGcTGCtG 353 200

<CDR2 CDR2><FR3
Y H K G E E R A K G N I L E R F S A Q Q F P D F L F F
Quality 22224664272772777762664467626552222267767777765422666277777777766636665226665653
Target1 0 TATCATAAAGGAGAAGAGAGAGCAAAAGGAAACATTCTTGAACGATTCTCCGCACAACAGTTCCCTGACTTTCTTTTTTT 79 Score
TRBV9*00 201 tatTataaTggagaagagagagcaaaaggaaacattcttgaacgattctccgcacaacagttccctgactt 271 327

Q A E D G I R H R S R H S C * T A L P I * T P V E L
Quality 27773753566265533552552226776565636377677777752777762626237777377772777667667733
Target1 80 TCAAGCAGAAGACGGCATACGACATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATTTGAACCCCTGTGGAGCTGA 159 Score

R C N Y S S S V S V Y L F * Y P E P * P C R V P A E R
Quality 67777776777762277637766776776776667776676776636666776636366767337776636766636367
Target1 160 GGTGCAACTACTCATCGTCTGTTTCAGTGTATCTCTTCTGATATCCAGAACCCTGACCCTGCCGTGTACCAGCTGAGAGA 239 Score

L S Q _
Quality 66776666366
Target1 240 CTTAGTCAGCC 250 Score
```
One can use BLAST to search for the unaligned parts of the sequence in order to find out their origin.
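Before inspecting individual reads, it can help to get a rough count of how many exported records lack a J hit at all. The following is a minimal sketch that assumes the file written by `mixcr exportAlignmentsPretty` starts each record with a `Read ids:` line and lists every gene hit on its own line beginning with the gene name. The function name `count_missing_j` is hypothetical.

```shell
# Count exported alignment records and how many of them have no J gene line.
# $1 = file written by mixcr exportAlignmentsPretty
count_missing_j() {
    awk '
        /Read ids:/ {                       # a new record starts here
            if (seen && !has_j) missing++   # close the previous record
            seen = 1; has_j = 0; total++
            next
        }
        # J hit lines start with a gene name such as TRBJ2-3*00 or TRAJ12*00
        /^[ \t]*(TR[ABDG]|IG[HKL])J[0-9]/ { has_j = 1 }
        END {
            if (seen && !has_j) missing++   # close the last record
            printf "%d records, %d without a J hit\n", total, missing
        }' "$1"
}
```

Calling `count_missing_j debug/CRC003_therapy-08_debug.alignments.txt` prints a single summary line for the sample examined above.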
Another quality report we should investigate is a chain abundance plot.
```shell
mixcr exportQc chainUsage \
    results/*.clns \
    figs/chainUsage.pdf
```
Most samples consist of roughly equal proportions of TRA and TRB chains. However, a few samples (e.g. CRC013_therapy-02) have no TRA chains, which might indicate cDNA library preparation issues and needs further investigation.
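A similar check can be made numerically from the align reports, which contain lines such as `TRA chains: 370 (0.1%)` and `TRB chains: 386282 (99.9%)`. The sketch below is only an approximation of the plot: it counts aligned reads rather than clones, and the function name `tra_share` is an assumption introduced for illustration.

```shell
# Print the TRA share (in %) of TRA + TRB aligned reads from a text align report.
# $1 = align report file
tra_share() {
    awk '
        /^[ \t]*TRA chains:/ { tra = $3 }   # third field is the read count
        /^[ \t]*TRB chains:/ { trb = $3 }
        END {
            if (tra + trb == 0) { print "NA"; exit }
            printf "%.1f\n", 100 * tra / (tra + trb)
        }' "$1"
}
```

A value close to zero for a sample, as opposed to a value near 50, points to the kind of missing TRA library described above.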
Full-length clonotype assembly
The iRepertoire protocol allows recovering a broader TCR sequence than just the CDR3 region. According to the protocol, forward primers are located in the FR1 region, so we can safely use an assembling feature that starts from CDR1 and be sure that no primers affect the original sequence. The reverse primers are located in the FR4 region very close to CDR3, so there is little left of FR4 to include in clone assembly.
Taking into account what is mentioned above, the longest possible assembling feature for this protocol is "{CDR1Begin:CDR3End}".
MiXCR has a dedicated preset to obtain full-length TCR clones with the iRepertoire protocol:
```shell
mixcr analyze irepertoire-human-rna-xcr-repseq-lr \
    raw/CRC016_preTherapy_R1.fastq.gz \
    raw/CRC016_preTherapy_R2.fastq.gz \
    results/CRC016_preTherapy
```
The mixcr assemble step in this preset differs from the one above in the following manner:
```shell
mixcr assemble \
    -OassemblingFeatures="{CDR1Begin:CDR3End}" \
    -OseparateByJ=true \
    --report results/CRC016_preTherapy.assemble.report.txt \
    --json-report results/CRC016_preTherapy.assemble.report.json \
    results/CRC016_preTherapy.vdjca \
    results/CRC016_preTherapy.clns
```
`-OassemblingFeatures="{CDR1Begin:CDR3End}"` sets the assembling feature to the region that starts at `CDR1Begin` and ends at the end of `CDR3`.
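To apply the full-length preset to every sample rather than a single one, the command can be wrapped in a loop. This is a sketch that assumes paired files named `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz`, as in the `raw/` directory used here. The helper names `sample_name` and `analyze_all`, and the `MIXCR` variable used for dry runs, are illustrative and not part of MiXCR.

```shell
# Command used to invoke MiXCR; set MIXCR=echo to print commands instead of running them.
MIXCR=${MIXCR:-mixcr}

# Derive the sample name from an R1 file path:
# raw/CRC016_preTherapy_R1.fastq.gz -> CRC016_preTherapy
sample_name() {
    basename "$1" _R1.fastq.gz
}

# Run the full-length preset for every R1/R2 pair in a directory.
# $1 = directory with FASTQ files, $2 = output directory
analyze_all() {
    mkdir -p "$2"
    for r1 in "$1"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue            # skip if the glob matched nothing
        sample=$(sample_name "$r1")
        r2="$1/${sample}_R2.fastq.gz"
        $MIXCR analyze irepertoire-human-rna-xcr-repseq-lr \
            "$r1" "$r2" "$2/$sample"
    done
}
```

Running `analyze_all raw results` processes the samples one after another. Setting `MIXCR=echo` first shows the commands that would be executed without running them.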
Reports
Finally, MiXCR provides a convenient way to look at the reports generated at each step. Every `.vdjca`, `.clns` and `.clna` file holds the reports for every MiXCR function that has been applied to the sample. In our case, the `.clns` file contains reports for `mixcr align` and `mixcr assemble`. To output these reports, use `mixcr exportReports` as shown below. Note that the `--json` parameter produces a JSON-formatted report.
```shell
mixcr exportReports \
    results/CRC016_preTherapy.clns \
    figs/CRC016_preTherapy.report.txt
```
```shell
mixcr exportReports \
    --json \
    results/CRC016_preTherapy.clns \
    figs/CRC016_preTherapy.report.json
```
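Once exported, the JSON report can be summarised programmatically. The sketch below assumes the file contains one JSON object per step written back to back (an align report followed by an assemble report, as in the example in this tutorial) and that the field names match that example. The helpers `read_reports` and `summarize` are illustrative and not part of MiXCR.

```python
import json


def read_reports(text):
    """Decode one or more JSON objects written back to back in a single string."""
    decoder = json.JSONDecoder()
    reports, pos = [], 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        reports.append(obj)
    return reports


def summarize(reports):
    """Compute a few headline percentages from the align and assemble reports."""
    by_type = {r["type"]: r for r in reports}
    align = by_type["alignerReport"]
    assemble = by_type["assemblerReport"]
    total = align["totalReadsProcessed"]
    return {
        "aligned_pct": round(100 * align["aligned"] / total, 2),
        "reads_in_clones_pct": round(100 * assemble["readsInClones"] / total, 2),
        "clones": assemble["clones"],
    }


# Trimmed-down stand-in for the exported file, using counts from this tutorial.
sample = (
    '{"type": "alignerReport", "totalReadsProcessed": 409090, "aligned": 386658} '
    '{"type": "assemblerReport", "totalReadsProcessed": 409090, '
    '"readsInClones": 358492, "clones": 24390}'
)
print(summarize(read_reports(sample)))
```

For a real file, the string would come from something like `open("figs/CRC016_preTherapy.report.json").read()`. If the actual layout differs (for example, a single JSON array), plain `json.load` may be enough.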
Show report file
```text
============== Align Report ==============
Input file(s): /raw/iRepertoire/CRC016_preTherapy_R1.fastq.gz,/raw/iRepertoire/CRC016_preTherapy_R2.fastq.gz
Output file(s): results-trimmed/CRC016_preTherapy.vdjca
Version: 4.0.0-331-protocols; built=Thu Oct 06 19:33:27 CEST 2022; rev=83ef6ba9c4; lib=repseqio.v2.0
Command line arguments: --report results-trimmed/CRC016_preTherapy.align.report.txt --json-report results-trimmed/CRC016_preTherapy.align.report.json --preset local:irepertoire-human-tcr-lr-cdr3 /raw/iRepertoire/CRC016_preTherapy_R1.fastq.gz /raw/iRepertoire/CRC016_preTherapy_R2.fastq.gz results-trimmed/CRC016_preTherapy.vdjca
Analysis time: 0ns
Total sequencing reads: 409090
Successfully aligned reads: 386658 (94.52%)
Paired-end alignment conflicts eliminated: 3486 (0.85%)
Alignment failed, no hits (not TCR/IG?): 4045 (0.99%)
Alignment failed because of absence of V hits: 29 (0.01%)
Alignment failed because of absence of J hits: 17953 (4.39%)
No target with both V and J alignments: 203 (0.05%)
Alignment failed because of low total score: 202 (0.05%)
Overlapped: 392724 (96%)
Overlapped and aligned: 371487 (90.81%)
Alignment-aided overlaps: 722 (0.19%)
Overlapped and not aligned: 21237 (5.19%)
No CDR3 parts alignments, percent of successfully aligned: 94 (0.02%)
Partial aligned reads, percent of successfully aligned: 575 (0.15%)
V gene chimeras: 481 (0.12%)
TRA chains: 370 (0.1%)
TRA non-functional: 57 (15.41%)
TRB chains: 386282 (99.9%)
TRB non-functional: 8075 (2.09%)
TRD chains: 5 (0%)
TRD non-functional: 0 (0%)
IGH chains: 1 (0%)
IGH non-functional: 0 (0%)
Realigned with forced non-floating bound: 34176 (8.35%)
Realigned with forced non-floating right bound in left read: 9768 (2.39%)
Realigned with forced non-floating left bound in right read: 9768 (2.39%)
============== Assemble Report ==============
Input file(s): results-trimmed/CRC016_preTherapy.vdjca
Output file(s): results-trimmed/CRC016_preTherapy.clns
Version: 4.0.0-331-protocols; built=Thu Oct 06 19:33:27 CEST 2022; rev=83ef6ba9c4; lib=repseqio.v2.0
Command line arguments: --report results-trimmed/CRC016_preTherapy.assemble.report.txt --json-report results-trimmed/CRC016_preTherapy.assemble.report.json results-trimmed/CRC016_preTherapy.vdjca results-trimmed/CRC016_preTherapy.clns
Analysis time: 0ns
Final clonotype count: 24390
Average number of reads per clonotype: 14.7
Reads used in clonotypes, percent of total: 358492 (87.63%)
Reads used in clonotypes before clustering, percent of total: 379951 (92.88%)
Number of reads used as a core, percent of used: 377014 (99.23%)
Mapped low quality reads, percent of used: 2937 (0.77%)
Reads clustered in PCR error correction, percent of used: 21459 (5.65%)
Reads pre-clustered due to the similar VJC-lists, percent of used: 76 (0.02%)
Reads dropped due to the lack of a clone sequence, percent of total: 2924 (0.71%)
Reads dropped due to a too short clonal sequence, percent of total: 143 (0.03%)
Reads dropped due to low quality, percent of total: 1 (0%)
Reads dropped due to failed mapping, percent of total: 3571 (0.87%)
Reads dropped with low quality clones, percent of total: 1 (0%)
Clonotypes eliminated by PCR error correction: 9372
Clonotypes dropped as low quality: 1
Clonotypes pre-clustered due to the similar VJC-lists: 69
TRB chains: 24390 (100%)
TRB non-functional: 1117 (4.58%)
```
```json
{
  "type": "alignerReport",
  "commandLine": "--report results-trimmed/CRC016_preTherapy.align.report.txt --json-report results-trimmed/CRC016_preTherapy.align.report.json --preset local:irepertoire-human-tcr-lr-cdr3 /raw/iRepertoire/CRC016_preTherapy_R1.fastq.gz /raw/iRepertoire/CRC016_preTherapy_R2.fastq.gz results-trimmed/CRC016_preTherapy.vdjca",
  "inputFiles": [
    "/raw/iRepertoire/CRC016_preTherapy_R1.fastq.gz",
    "/raw/iRepertoire/CRC016_preTherapy_R2.fastq.gz"
  ],
  "outputFiles": [
    "results-trimmed/CRC016_preTherapy.vdjca"
  ],
  "version": "4.0.0-331-protocols; built=Thu Oct 06 19:33:27 CEST 2022; rev=83ef6ba9c4; lib=repseqio.v2.0",
  "trimmingReport": null,
  "totalReadsProcessed": 409090,
  "aligned": 386658,
  "notAligned": 22432,
  "notAlignedReasons": {
    "NoCDR3Parts": 0,
    "NoBarcode": 0,
    "LowTotalScore": 202,
    "NoHits": 4045,
    "VAndJOnDifferentTargets": 203,
    "NoVHits": 29,
    "NoJHits": 17953
  },
  "chimeras": 0,
  "overlapped": 392724,
  "alignmentAidedOverlaps": 722,
  "overlappedAligned": 371487,
  "overlappedNotAligned": 21237,
  "pairedEndAlignmentConflicts": 3486,
  "vChimeras": 481,
  "jChimeras": 0,
  "chainUsage": {
    "type": "chainUsage",
    "chimeras": 0,
    "total": 386658,
    "chains": {
      "TRA": { "total": 370, "nonFunctional": 57, "isOOF": 41, "hasStops": 16 },
      "TRB": { "total": 386282, "nonFunctional": 8075, "isOOF": 6763, "hasStops": 1312 },
      "TRD": { "total": 5, "nonFunctional": 0, "isOOF": 0, "hasStops": 0 },
      "IGH": { "total": 1, "nonFunctional": 0, "isOOF": 0, "hasStops": 0 }
    }
  },
  "realignedWithForcedNonFloatingBound": 34176,
  "realignedWithForcedNonFloatingRightBoundInLeftRead": 9768,
  "realignedWithForcedNonFloatingLeftBoundInRightRead": 9768,
  "noCDR3PartsAlignments": 94,
  "partialAlignments": 575,
  "tagReport": null
}
{
  "type": "assemblerReport",
  "commandLine": "--report results-trimmed/CRC016_preTherapy.assemble.report.txt --json-report results-trimmed/CRC016_preTherapy.assemble.report.json results-trimmed/CRC016_preTherapy.vdjca results-trimmed/CRC016_preTherapy.clns",
  "inputFiles": [
    "results-trimmed/CRC016_preTherapy.vdjca"
  ],
  "outputFiles": [
    "results-trimmed/CRC016_preTherapy.clns"
  ],
  "version": "4.0.0-331-protocols; built=Thu Oct 06 19:33:27 CEST 2022; rev=83ef6ba9c4; lib=repseqio.v2.0",
  "preCloneAssemblerReport": null,
  "totalReadsProcessed": 409090,
  "initialClonesCreated": 33832,
  "readsDroppedNoTargetSequence": 2924,
  "readsDroppedTooShortClonalSequence": 143,
  "readsDroppedLowQuality": 68,
  "coreReads": 377014,
  "readsDroppedFailedMapping": 3571,
  "lowQualityRescued": 2937,
  "clonesClustered": 9372,
  "readsClustered": 21459,
  "clones": 24390,
  "clonesDroppedAsLowQuality": 1,
  "clonesPreClustered": 69,
  "readsPreClustered": 76,
  "readsInClones": 358492,
  "readsInClonesBeforeClustering": 379951,
  "readsDroppedWithLowQualityClones": 1,
  "clonalChainUsage": {
    "type": "chainUsage",
    "chimeras": 0,
    "total": 24390,
    "chains": {
      "TRB": { "total": 24390, "nonFunctional": 1117, "isOOF": 905, "hasStops": 212 }
    }
  }
}
```