iRepertoire
Here we discuss how to process data obtained with the iRepertoire TCR LR kit. This is a multiplex protocol in which the forward primers are located in the FR1 region of the V gene and the reverse primers are complementary to the constant region.
Data libraries
This tutorial uses data from the following publication: Yi-Tung Chen et al., "Longitudinal High-Throughput Sequencing of the T-Cell Receptor Repertoire Reveals Dynamic Change and Prognostic Significance of Peripheral Blood TCR Diversity in Metastatic Colorectal Cancer During Chemotherapy", Front. Immunol., 2022 Jan;12:743448, doi: 10.3389/fimmu.2021.743448
A total of 36 subjects, including 20 healthy controls and 16 metastatic CRC patients, were enrolled in this study. Peripheral blood samples were obtained from 20 age-matched healthy controls (62.6 ± 10.48 years old) and 16 CRC patients (62.38 ± 12.62 years old) before therapy. Among the 16 CRC patients, 67 peripheral blood samples were collected from 13 patients with follow-up every two months for approximately 98 to 452 days. This gives 103 samples in total (20 + 16 + 67). Peripheral blood mononuclear cells (PBMCs) were isolated following the standard procedure, and total RNA from PBMCs was extracted using TRIzol reagent (Invitrogen, Carlsbad, CA) according to the manufacturer’s protocol. A multiplex PCR amplification reaction was used to amplify the TCR immune repertoire. Human TCRα and TCRβ libraries were prepared using the HTAI-M and HTBI-M Kits (iRepertoire, Inc.) according to the manufacturer’s instructions, and 2 × 250 bp paired-end sequencing was performed on the Illumina MiSeq platform.
All data is available from SRA under accession PRJNA754274 and can be retrieved, for example, with SRA Explorer.
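Use aria2c to download the full dataset efficiently with the proper filenames. The command reads download-list.txt, whose contents (pairs of a URL line and an out= line) are listed right after it. In that file each out= line must start with whitespace so that aria2c treats it as an option of the preceding URL: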
mkdir -p raw
aria2c -c -s 16 -x 16 -k 1M -j 8 -i download-list.txt
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365468/SRR8365468_1.fastq.gz
out=raw/SRR8365468_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365468/SRR8365468_2.fastq.gz
out=raw/SRR8365468_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365457/SRR8365457_1.fastq.gz
out=raw/SRR8365457_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365457/SRR8365457_2.fastq.gz
out=raw/SRR8365457_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365458/SRR8365458_1.fastq.gz
out=raw/SRR8365458_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365458/SRR8365458_2.fastq.gz
out=raw/SRR8365458_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365459/SRR8365459_1.fastq.gz
out=raw/SRR8365459_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365459/SRR8365459_2.fastq.gz
out=raw/SRR8365459_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365463/SRR8365463_1.fastq.gz
out=raw/SRR8365463_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365463/SRR8365463_2.fastq.gz
out=raw/SRR8365463_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365469/SRR8365469_1.fastq.gz
out=raw/SRR8365469_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365469/SRR8365469_2.fastq.gz
out=raw/SRR8365469_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365465/SRR8365465_1.fastq.gz
out=raw/SRR8365465_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365465/SRR8365465_2.fastq.gz
out=raw/SRR8365465_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365467/SRR8365467_1.fastq.gz
out=raw/SRR8365467_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365467/SRR8365467_2.fastq.gz
out=raw/SRR8365467_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365464/SRR8365464_1.fastq.gz
out=raw/SRR8365464_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365464/SRR8365464_2.fastq.gz
out=raw/SRR8365464_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365450/SRR8365450_1.fastq.gz
out=raw/SRR8365450_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365450/SRR8365450_2.fastq.gz
out=raw/SRR8365450_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365461/SRR8365461_1.fastq.gz
out=raw/SRR8365461_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365461/SRR8365461_2.fastq.gz
out=raw/SRR8365461_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365462/SRR8365462_1.fastq.gz
out=raw/SRR8365462_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365462/SRR8365462_2.fastq.gz
out=raw/SRR8365462_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365456/SRR8365456_1.fastq.gz
out=raw/SRR8365456_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365456/SRR8365456_2.fastq.gz
out=raw/SRR8365456_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365475/SRR8365475_1.fastq.gz
out=raw/SRR8365475_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365475/SRR8365475_2.fastq.gz
out=raw/SRR8365475_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365460/SRR8365460_1.fastq.gz
out=raw/SRR8365460_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365460/SRR8365460_2.fastq.gz
out=raw/SRR8365460_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365470/SRR8365470_1.fastq.gz
out=raw/SRR8365470_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365470/SRR8365470_2.fastq.gz
out=raw/SRR8365470_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365471/SRR8365471_1.fastq.gz
out=raw/SRR8365471_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365471/SRR8365471_2.fastq.gz
out=raw/SRR8365471_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365473/SRR8365473_1.fastq.gz
out=raw/SRR8365473_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365473/SRR8365473_2.fastq.gz
out=raw/SRR8365473_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365449/SRR8365449_1.fastq.gz
out=raw/SRR8365449_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365449/SRR8365449_2.fastq.gz
out=raw/SRR8365449_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365482/SRR8365482_1.fastq.gz
out=raw/SRR8365482_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365482/SRR8365482_2.fastq.gz
out=raw/SRR8365482_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365446/SRR8365446_1.fastq.gz
out=raw/SRR8365446_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365446/SRR8365446_2.fastq.gz
out=raw/SRR8365446_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365483/SRR8365483_1.fastq.gz
out=raw/SRR8365483_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365483/SRR8365483_2.fastq.gz
out=raw/SRR8365483_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365447/SRR8365447_1.fastq.gz
out=raw/SRR8365447_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365447/SRR8365447_2.fastq.gz
out=raw/SRR8365447_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365484/SRR8365484_1.fastq.gz
out=raw/SRR8365484_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365484/SRR8365484_2.fastq.gz
out=raw/SRR8365484_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365448/SRR8365448_1.fastq.gz
out=raw/SRR8365448_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365448/SRR8365448_2.fastq.gz
out=raw/SRR8365448_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365424/SRR8365424_1.fastq.gz
out=raw/SRR8365424_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365424/SRR8365424_2.fastq.gz
out=raw/SRR8365424_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365485/SRR8365485_1.fastq.gz
out=raw/SRR8365485_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365485/SRR8365485_2.fastq.gz
out=raw/SRR8365485_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365488/SRR8365488_1.fastq.gz
out=raw/SRR8365488_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365488/SRR8365488_2.fastq.gz
out=raw/SRR8365488_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365421/SRR8365421_1.fastq.gz
out=raw/SRR8365421_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365421/SRR8365421_2.fastq.gz
out=raw/SRR8365421_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365489/SRR8365489_1.fastq.gz
out=raw/SRR8365489_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365489/SRR8365489_2.fastq.gz
out=raw/SRR8365489_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365490/SRR8365490_1.fastq.gz
out=raw/SRR8365490_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365490/SRR8365490_2.fastq.gz
out=raw/SRR8365490_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365246/SRR8365246_1.fastq.gz
out=raw/SRR8365246_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365246/SRR8365246_2.fastq.gz
out=raw/SRR8365246_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365474/SRR8365474_1.fastq.gz
out=raw/SRR8365474_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365474/SRR8365474_2.fastq.gz
out=raw/SRR8365474_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365422/SRR8365422_1.fastq.gz
out=raw/SRR8365422_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365422/SRR8365422_2.fastq.gz
out=raw/SRR8365422_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365423/SRR8365423_1.fastq.gz
out=raw/SRR8365423_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365423/SRR8365423_2.fastq.gz
out=raw/SRR8365423_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365420/SRR8365420_1.fastq.gz
out=raw/SRR8365420_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365420/SRR8365420_2.fastq.gz
out=raw/SRR8365420_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365419/SRR8365419_1.fastq.gz
out=raw/SRR8365419_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365419/SRR8365419_2.fastq.gz
out=raw/SRR8365419_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365248/SRR8365248_1.fastq.gz
out=raw/SRR8365248_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365248/SRR8365248_2.fastq.gz
out=raw/SRR8365248_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365249/SRR8365249_1.fastq.gz
out=raw/SRR8365249_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365249/SRR8365249_2.fastq.gz
out=raw/SRR8365249_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365247/SRR8365247_1.fastq.gz
out=raw/SRR8365247_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365247/SRR8365247_2.fastq.gz
out=raw/SRR8365247_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365250/SRR8365250_1.fastq.gz
out=raw/SRR8365250_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365250/SRR8365250_2.fastq.gz
out=raw/SRR8365250_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365348/SRR8365348_1.fastq.gz
out=raw/SRR8365348_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365348/SRR8365348_2.fastq.gz
out=raw/SRR8365348_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365251/SRR8365251_1.fastq.gz
out=raw/SRR8365251_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365251/SRR8365251_2.fastq.gz
out=raw/SRR8365251_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365418/SRR8365418_1.fastq.gz
out=raw/SRR8365418_HIP3_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365418/SRR8365418_2.fastq.gz
out=raw/SRR8365418_HIP3_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365310/SRR8365310_1.fastq.gz
out=raw/SRR8365310_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365310/SRR8365310_2.fastq.gz
out=raw/SRR8365310_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365252/SRR8365252_1.fastq.gz
out=raw/SRR8365252_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365252/SRR8365252_2.fastq.gz
out=raw/SRR8365252_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365308/SRR8365308_1.fastq.gz
out=raw/SRR8365308_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365308/SRR8365308_2.fastq.gz
out=raw/SRR8365308_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365309/SRR8365309_1.fastq.gz
out=raw/SRR8365309_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365309/SRR8365309_2.fastq.gz
out=raw/SRR8365309_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365253/SRR8365253_1.fastq.gz
out=raw/SRR8365253_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365253/SRR8365253_2.fastq.gz
out=raw/SRR8365253_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365307/SRR8365307_1.fastq.gz
out=raw/SRR8365307_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365307/SRR8365307_2.fastq.gz
out=raw/SRR8365307_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365259/SRR8365259_1.fastq.gz
out=raw/SRR8365259_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365259/SRR8365259_2.fastq.gz
out=raw/SRR8365259_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365258/SRR8365258_1.fastq.gz
out=raw/SRR8365258_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365258/SRR8365258_2.fastq.gz
out=raw/SRR8365258_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365306/SRR8365306_1.fastq.gz
out=raw/SRR8365306_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365306/SRR8365306_2.fastq.gz
out=raw/SRR8365306_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365305/SRR8365305_1.fastq.gz
out=raw/SRR8365305_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365305/SRR8365305_2.fastq.gz
out=raw/SRR8365305_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365304/SRR8365304_1.fastq.gz
out=raw/SRR8365304_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365304/SRR8365304_2.fastq.gz
out=raw/SRR8365304_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365260/SRR8365260_1.fastq.gz
out=raw/SRR8365260_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365260/SRR8365260_2.fastq.gz
out=raw/SRR8365260_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365303/SRR8365303_1.fastq.gz
out=raw/SRR8365303_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365303/SRR8365303_2.fastq.gz
out=raw/SRR8365303_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365261/SRR8365261_1.fastq.gz
out=raw/SRR8365261_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365261/SRR8365261_2.fastq.gz
out=raw/SRR8365261_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365262/SRR8365262_1.fastq.gz
out=raw/SRR8365262_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365262/SRR8365262_2.fastq.gz
out=raw/SRR8365262_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365302/SRR8365302_1.fastq.gz
out=raw/SRR8365302_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365302/SRR8365302_2.fastq.gz
out=raw/SRR8365302_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365301/SRR8365301_1.fastq.gz
out=raw/SRR8365301_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365301/SRR8365301_2.fastq.gz
out=raw/SRR8365301_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365263/SRR8365263_1.fastq.gz
out=raw/SRR8365263_HIP2_male_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365263/SRR8365263_2.fastq.gz
out=raw/SRR8365263_HIP2_male_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365264/SRR8365264_1.fastq.gz
out=raw/SRR8365264_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365264/SRR8365264_2.fastq.gz
out=raw/SRR8365264_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365267/SRR8365267_1.fastq.gz
out=raw/SRR8365267_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/007/SRR8365267/SRR8365267_2.fastq.gz
out=raw/SRR8365267_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365274/SRR8365274_1.fastq.gz
out=raw/SRR8365274_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/004/SRR8365274/SRR8365274_2.fastq.gz
out=raw/SRR8365274_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365269/SRR8365269_1.fastq.gz
out=raw/SRR8365269_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/009/SRR8365269/SRR8365269_2.fastq.gz
out=raw/SRR8365269_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365271/SRR8365271_1.fastq.gz
out=raw/SRR8365271_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/001/SRR8365271/SRR8365271_2.fastq.gz
out=raw/SRR8365271_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365273/SRR8365273_1.fastq.gz
out=raw/SRR8365273_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/003/SRR8365273/SRR8365273_2.fastq.gz
out=raw/SRR8365273_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365265/SRR8365265_1.fastq.gz
out=raw/SRR8365265_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/005/SRR8365265/SRR8365265_2.fastq.gz
out=raw/SRR8365265_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365270/SRR8365270_1.fastq.gz
out=raw/SRR8365270_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/000/SRR8365270/SRR8365270_2.fastq.gz
out=raw/SRR8365270_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365272/SRR8365272_1.fastq.gz
out=raw/SRR8365272_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/002/SRR8365272/SRR8365272_2.fastq.gz
out=raw/SRR8365272_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365266/SRR8365266_1.fastq.gz
out=raw/SRR8365266_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/006/SRR8365266/SRR8365266_2.fastq.gz
out=raw/SRR8365266_HIP1_female_R2.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365268/SRR8365268_1.fastq.gz
out=raw/SRR8365268_HIP1_female_R1.fastq.gz
ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR836/008/SRR8365268/SRR8365268_2.fastq.gz
out=raw/SRR8365268_HIP1_female_R2.fastq.gz
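Each entry in the list above pairs a download URL with an out= line that sets the local file name. The following is a minimal sketch of how such entries could be generated rather than typed by hand. The function name `aria2_entry` is hypothetical, and the URL layout (first six characters of the accession, then `00` plus its last digit) is an assumption read off the entries above for 10-character run accessions; it is not taken from any official specification.

```shell
# Print one aria2c input-file entry for a run accession.
# $1 = run accession (assumed to be 10 characters, e.g. SRR8365468)
# $2 = sample label used in the local file name (e.g. HIP2_male)
# $3 = mate number, 1 or 2
aria2_entry() {
    acc=$1
    label=$2
    mate=$3
    prefix=${acc%????}       # drop the last four characters -> SRR836
    last=${acc#?????????}    # drop the first nine characters -> last digit
    # URI line
    printf 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/%s/00%s/%s/%s_%s.fastq.gz\n' \
        "$prefix" "$last" "$acc" "$acc" "$mate"
    # Option line: the leading spaces are required by the aria2c input format
    printf '  out=raw/%s_%s_R%s.fastq.gz\n' "$acc" "$label" "$mate"
}
```

For example, `aria2_entry SRR8365468 HIP2_male 1 >> download-list.txt` would append the first entry of the list, with the out= line indented.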
The project contains 103 FASTQ file pairs. For the purpose of this tutorial we assume that all FASTQ files are stored in the raw/ folder. The data was obtained using multiplex primers: forward primers in the V gene and reverse primers in the constant region, as described above. The structure of the cDNA library is shown in the picture below.
Upstream analysis
MiXCR has a dedicated preset for this protocol, so analysing the data is as easy as:
mixcr analyze irepertoire-human-rna-xcr-repseq-sr \
raw/CRC016_preTherapy_R1.fastq.gz \
raw/CRC016_preTherapy_R2.fastq.gz \
results/CRC016_preTherapy
One might also use GNU Parallel to process all samples at once:
#!/usr/bin/env bash
mkdir -p results
ls raw/*R1* |
parallel -j 2 --line-buffer \
"mixcr analyze irepertoire-human-rna-xcr-repseq-sr \
{} \
{=s:R1:R2:=} \
{=s:.*/:results/:;s:_R.*::=}"
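The two replacement strings in the script above derive the R2 file name and the output prefix from each R1 path. The sketch below reproduces that naming logic in plain shell and prints the resulting commands without running them, which can help to check the pairing before launching a long job. The function names `mate_of`, `output_prefix` and `print_jobs` are hypothetical, and the sketch assumes that files end in `_R1.fastq.gz` and `_R2.fastq.gz`.

```shell
# Path of the R2 file that belongs to a given R1 file.
mate_of() {
    printf '%s\n' "$1" | sed 's/_R1\.fastq\.gz$/_R2.fastq.gz/'
}

# Output prefix in results/ for a given R1 file.
output_prefix() {
    printf 'results/%s\n' "$(basename "$1" | sed 's/_R1\.fastq\.gz$//')"
}

# Dry run: print one "mixcr analyze" command per complete R1/R2 pair in a folder.
print_jobs() {
    for r1 in "$1"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        r2=$(mate_of "$r1")
        if [ ! -e "$r2" ]; then
            echo "skipping $r1: mate file not found" >&2
            continue
        fi
        printf 'mixcr analyze irepertoire-human-rna-xcr-repseq-sr %s %s %s\n' \
            "$r1" "$r2" "$(output_prefix "$r1")"
    done
}
```

Running `print_jobs raw` lists the commands; piping that output to `sh` would execute them one after another, as a sequential alternative to GNU Parallel.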
Under the hood pipeline:
Under the hood, irepertoire-human-rna-xcr-repseq-sr executes the following pipeline:
align
Alignment of raw sequencing reads against reference database of V-, D-, J- and C- gene segments.
mixcr align \
--species hsa \
-p default_4.0 \
-OvParameters.geneFeatureToAlign="VTranscriptWithout5UTRWithP" \
-OvParameters.parameters.floatingLeftBound=true \
-OjParameters.parameters.floatingRightBound=false \
-OcParameters.parameters.floatingRightBound=true \
--report results/CRC016_preTherapy.report.txt \
--json-report results/CRC016_preTherapy.report.json \
raw/CRC016_preTherapy_R1.fastq.gz \
raw/CRC016_preTherapy_R2.fastq.gz \
results/CRC016_preTherapy.vdjca
Option --report is specified here explicitly.
--species hsa
- Determines the organism species.
-p default_4.0
- Selects a preset of MiXCR alignment parameters.
-OvParameters.geneFeatureToAlign="VTranscriptWithout5UTRWithP"
- Sets the V gene feature to align. Check gene features for more info.
-OvParameters.parameters.floatingLeftBound=true
- Results in a local alignment algorithm for the V gene left bound, because primer sequences are present in the V gene region.
-OjParameters.parameters.floatingRightBound=false -OcParameters.parameters.floatingRightBound=true
- Results in a global alignment algorithm for the J gene right bound and a local alignment algorithm for the C gene right bound, because the reverse primers are located in the C gene region.
assemble
Assembles alignments into clonotypes and applies several layers of error correction (e.g. quality-aware correction of sequencing errors and clustering to correct for PCR errors). Check mixcr assemble for more information. By default, clones are assembled by the CDR3 gene feature.
-OseparateByJ=true
- Splits clones with the same CDR3 sequence and different J genes.
mixcr assemble \
-OassemblingFeatures="CDR3" \
-OseparateByJ=true \
--report results/CRC016_preTherapy.report.txt \
--json-report results/CRC016_preTherapy.report.json \
results/CRC016_preTherapy.vdjca \
results/CRC016_preTherapy.clns
export
Exports clonotypes from .clns file into human-readable tables.
mixcr exportClones \
-c TRA \
results/CRC016_preTherapy.clns \
results/CRC016_preTherapy.clonotypes.TRA.tsv
mixcr exportClones \
-c TRB \
results/CRC016_preTherapy.clns \
results/CRC016_preTherapy.clonotypes.TRB.tsv
-c <chain>
- Defines a specific chain to be exported.
After execution is complete, the following files are generated for every sample:
# human-readable reports
CRC016_preTherapy.report
# raw alignments (highly compressed binary file)
CRC016_preTherapy.vdjca
# TRA, TRB CDR3 clonotypes (highly compressed binary file)
CRC016_preTherapy.clns
# TRA, TRB CDR3 clonotypes exported in tab-delimited txt
CRC016_preTherapy.TRA.tsv
CRC016_preTherapy.TRB.tsv
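When many samples are processed, it can be useful to confirm that every expected file is present. The sketch below is one way to do that, assuming the file names exactly as listed above (suffixes report, vdjca, clns, TRA.tsv and TRB.tsv); the function name `missing_outputs` is hypothetical and the suffix list may need adjusting to the names your run actually produces.

```shell
# Print the expected output files that do not exist for a sample prefix.
# $1 = path prefix of the sample, e.g. results/CRC016_preTherapy
missing_outputs() {
    for suffix in report vdjca clns TRA.tsv TRB.tsv; do
        # Print the path only when the file is absent
        [ -e "$1.$suffix" ] || printf '%s\n' "$1.$suffix"
    done
}
```

An empty result from `missing_outputs results/CRC016_preTherapy` means that all five files were found.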
While the .clns file holds all data and is used for downstream analysis with mixcr postanalysis, the output .tsv clonotype table also contains exhaustive information about each clonotype.
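Because the exported table is wide, a quick look at a few columns is often enough. The following sketch selects columns by header name from a tab-delimited file with awk. The function name `select_columns` is hypothetical; the column names used in the example (cloneId, cloneCount, aaSeqCDR3) are taken from the table header on this page.

```shell
# Print selected columns of a tab-delimited table, chosen by header name.
# $1 = path to the .tsv file
# $2 = comma-separated list of column names, in the desired output order
select_columns() {
    awk -F '\t' -v want="$2" '
        NR == 1 {
            n = split(want, names, ",")
            # Remember the position of every header field
            for (i = 1; i <= NF; i++) pos[$i] = i
            # Stop with a non-zero status if a requested column is absent
            for (k = 1; k <= n; k++) {
                if (!(names[k] in pos)) {
                    print "no such column: " names[k] | "cat 1>&2"
                    exit 1
                }
            }
        }
        {
            line = $(pos[names[1]])
            for (k = 2; k <= n; k++) line = line "\t" $(pos[names[k]])
            print line
        }
    ' "$1"
}
```

For example, `select_columns results/CRC016_preTherapy.clonotypes.TRB.tsv cloneId,cloneCount,aaSeqCDR3 | head -n 11` would show the header and the ten most abundant clonotypes, since clones are listed in order of decreasing count.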
See first 100 records from clonotype table CRC016_preTherapy:
cloneId | cloneCount | cloneFraction | targetSequences | targetQualities | allVHitsWithScore | allDHitsWithScore | allJHitsWithScore | allCHitsWithScore | allVAlignments | allDAlignments | allJAlignments | allCAlignments | nSeqFR1 | minQualFR1 | nSeqCDR1 | minQualCDR1 | nSeqFR2 | minQualFR2 | nSeqCDR2 | minQualCDR2 | nSeqFR3 | minQualFR3 | nSeqCDR3 | minQualCDR3 | nSeqFR4 | minQualFR4 | aaSeqFR1 | aaSeqCDR1 | aaSeqFR2 | aaSeqCDR2 | aaSeqFR3 | aaSeqCDR3 | aaSeqFR4 | refPoints |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 32290 | 0.0900717 | TGTGCCAGCAGCACCTGGACAGGGAGTGGGGATGAGCAGTTCTTC | HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV7-9*00(1141.7) | TRBD1*00(41) | TRBJ2-1*00(209.2) | TRBC2*00(258.2) | 273 | 285 | 310 | 0 | 12 | 60.0 | 10 | 21 | 36 | 13 | 24 | SG12T | 41.0 | 28 | 42 | 70 | 31 | 45 | 70.0 | nan | nan | nan | nan | nan | ||
1 | 25443 | 0.0709723 | TGTGCCAGCAGCCCCTTTGAGGGACAGGGGCGCTTCGAGCAGTACTTC | HHHHHHHHHHHHHHHHHHHHHHHHHHHGGGGGGHGGHHHHHHHHHHHG | TRBV3-1*00(1156.4) | TRBD1*00(50) | TRBJ2-7*00(205.2) | TRBC2*00(258.1) | 270 | 283 | 307 | 0 | 13 | 65.0 | 12 | 22 | 36 | 20 | 30 | 50.0 | 23 | 39 | 67 | 32 | 48 | SA25T | 66.0 | nan | nan | nan | nan | nan | ||
2 | 9071 | 0.0253032 | TGTGCCAGCAGTTACTCGAACCCGGACAGTGTTGGGCCCTACGAGCAGTACTTC | HHHHHHHHHHHHHHHHHHGGGGGHHHHHGGGHGGGHGGGGGGHHHHHHHHHHHG | TRBV6-5*00(1207.6) | TRBD1*00(33) | TRBJ2-7*00(224.2) | TRBC2*00(258.1) | 270 | 289 | 307 | 0 | 19 | 95.0 | 9 | 19 | 36 | 20 | 29 | DG12 | 33.0 | 22 | 39 | 67 | 37 | 54 | 85.0 | nan | nan | nan | nan | nan | ||
3 | 7841 | 0.0218722 | TGTGCCTGGACAAAACCGGGCCAGGGTATCGCTGAAGCTTTCTTT | HHHHHHHHHHHGGGGGGHHHHHHHHGGHGGHHHHHHHHHHHHHHH | TRBV30*00(1146.3) | TRBD1*00(41) | TRBJ1-1*00(209.1) | TRBC1*00(257.8) | 270 | 280 | 304 | 0 | 10 | 50.0 | 10 | 21 | 36 | 15 | 26 | SA15C | 41.0 | 26 | 40 | 68 | 31 | 45 | 70.0 | nan | nan | nan | nan | nan | ||
4 | 6356 | 0.0177298 | TGCAGTGCCCGGGGGGGCCTCCATAGCAATCAGCCCCAGCATTTT | HHHHGGGGGGGGGGGGGGHHHHHHHHHHHHGGGGGHHHHHHHHHH | TRBV20-1*00(1182.6) | TRBD2*00(39) | TRBJ1-5*00(254.2) | TRBC1*00(258) | 279 | 287 | 313 | 0 | 8 | 40.0 | 24 | 38 | 48 | 9 | 22 | DA28SC32G | 39.0 | 19 | 42 | 70 | 22 | 45 | 115.0 | nan | nan | nan | nan | nan | ||
5 | 4823 | 0.0134536 | TGTGCCTGGAGACGGAGCACAGATACGCAGTATTTT | HHHHHHHHHHHHGHHHHHHHHHGGGGHHHHHHHHHH | TRBV30*00(1160.5) | nan | TRBJ2-3*00(244.2) | TRBC2*00(257.8) | 270 | 281 | 304 | 0 | 11 | 55.0 | nan | 20 | 41 | 69 | 15 | 36 | 105.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
6 | 4392 | 0.0122513 | TGTGCCTCGGGGACCTACGAGCAGTACTTC | HHGGHHHGHHHHHGGGGGHHHHHHHHHHHG | TRBV7-6*00(1117.5) | nan | TRBJ2-7*00(224.1) | TRBC2*00(258) | 273 | 279 | 310 | 0 | 6 | 30.0 | nan | 22 | 39 | 67 | 13 | 30 | 85.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
7 | 3785 | 0.0105581 | TGTGCCAGCAGCTTCTCGGATAGCAGAGAGACCCAGTACTTC | HHHHHHHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV5-4*00(1159.7) | nan | TRBJ2-5*00(220.1) | TRBC2*00(257.9) | 270 | 284 | 306 | 0 | 14 | 70.0 | nan | 13 | 40 | 68 | 13 | 42 | ST16CI19ASC21GI24G | 83.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
8 | 3443 | 0.00960412 | TGTGCCAGCAGCGTAACCGGGACTTACCCCCCAGATACGCAGTATTTT | HHHHHHHGGGGGGGGGGGGHHHHHGHGGGHHHHHGGGGHHHHHHHHHH | TRBV9*00(1160.3) | TRBD2*00(40),TRBD1*00(35) | TRBJ2-3*00(224.1) | TRBC2*00(258.1) | 270 | 285 | 306 | 0 | 15 | 75.0 | 14 | 22 | 48 | 16 | 24 | 40.0;10 | 17 | 36 | 16 | 23 | 35.0 | 24 | 41 | 69 | 31 | 48 | ||||
9 | 2785 | 0.00776865 | TGCGCCAGCAGCTTGGAACAGGGGGCGCGGACTGAAAAACTGTTTTTT | HHHGHHHHHHGHHHHHHHGGGGGGGGGGHHHHHHGHHHHHHGGHHHHH | TRBV5-1*00(1209.1) | TRBD1*00(55) | TRBJ1-4*00(219.1) | TRBC1*00(258.2) | 270 | 286 | 306 | 0 | 16 | 80.0 | 15 | 26 | 36 | 17 | 28 | 55.0 | 27 | 43 | 71 | 32 | 48 | 80.0 | nan | nan | nan | nan | nan | |||
10 | 2619 | 0.0073056 | TGCGCCAGCAATGAGTGGGGGGTCGGCACTGAAGCTTTCTTT | HHHGHHHHHHHHHHHHHGHGGGGGHHHHHHHHHHHHHHHHHH | TRBV10-2*00(1183.2) | TRBD1*00(25) | TRBJ1-1*00(219.1) | TRBC1*00(257.8) | 270 | 286 | 307 | 0 | 16 | SG280A | 66.0 | 18 | 23 | 36 | 16 | 21 | 25.0 | 24 | 40 | 68 | 26 | 42 | 80.0 | nan | nan | nan | nan | nan | ||
11 | 2355 | 0.00656918 | TGTGCCAGCTCACCGGGTCGTGGAACTGAAGCTTTCTTT | HHHHHHHHHGGGGGGGGGGHHHHHHHHHHHHHHHHHHHH | TRBV18*00(1172.8) | nan | TRBJ1-1*00(214.2) | TRBC1*00(258) | 273 | 287 | 310 | 0 | 14 | 70.0 | nan | 25 | 40 | 68 | 24 | 39 | 75.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
12 | 2312 | 0.00644924 | TGTGCCAGCTCTCAGCAAGCCCAGACCGGGGAGCTGTTTTTT | HHHHHHHHHHHHHHHGHHHHHHGGGGGGGHHHHHHGGHHHHH | TRBV28*00(1129.2) | nan | TRBJ2-2*00(229.1) | TRBC2*00(258.2) | 270 | 279 | 307 | 0 | 9 | 45.0 | nan | 25 | 43 | 71 | 24 | 42 | 90.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
13 | 1965 | 0.00548129 | TGTGCCAGCAGCGGACAGAGAACTATGAACACTGAAGCTTTCTTT | HHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV9*00(1151.4) | TRBD1*00(25) | TRBJ1-1*00(244.2) | TRBC1*00(257.4) | 270 | 283 | 306 | 0 | 13 | 65.0 | 14 | 19 | 36 | 13 | 18 | 25.0 | 19 | 40 | 68 | 24 | 45 | 105.0 | nan | nan | nan | nan | nan | |||
14 | 1902 | 0.00530556 | TGTGCCAGTTGGGGAGGCGAGCAGTACTTC | HHHHHHHGGHHGHGGGGGHHHHHHHGHHHG | TRBV6-5*00(1159.5) | TRBD2*00(30) | TRBJ2-7*00(204.1) | TRBC2*00(258.2) | 270 | 278 | 307 | 0 | 8 | 40.0 | 25 | 31 | 48 | 11 | 17 | 30.0 | 26 | 39 | 67 | 17 | 30 | 65.0 | nan | nan | nan | nan | nan | |||
15 | 1578 | 0.00440177 | TGTGCCAGCAGCCATCGGGACAGAAACTACGAGCAGTACTTC | HHHHHHHGGHHGGGGGHHHHHHHHHGGGGGHHHHHHHHHHHG | TRBV7-9*00(1137.3) | TRBD1*00(40) | TRBJ2-7*00(219.3) | TRBC2*00(258) | 273 | 285 | 310 | 0 | 12 | 60.0 | 11 | 19 | 36 | 15 | 23 | 40.0 | 23 | 39 | 67 | 26 | 42 | 80.0 | nan | nan | nan | nan | nan | |||
16 | 1003 | 0.00279783 | TGTGCCAGCAGCTTAGGGACAGATACGCAGTATTTT | HHHHHHHHHHHHHHHHHHHHHHGGGGHHHHHHHHHH | TRBV13*00(1188) | nan | TRBJ2-3*00(229.2) | TRBC2*00(258.2) | 270 | 287 | 307 | 0 | 17 | 85.0 | nan | 23 | 41 | 69 | 18 | 36 | 90.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
17 | 992 | 0.00276715 | TGTGCCTGGAGTGACAGGGTAGAGACCCAGTACTTC | HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHG | TRBV30*00(1168.5) | nan | TRBJ2-5*00(219.3) | TRBC2*00(258.1) | 270 | 287 | 304 | 0 | 16 | DT283 | 68.0 | nan | 24 | 40 | 68 | 20 | 36 | 80.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
18 | 962 | 0.00268346 | TGTGCCTGGATGGAGTCTGAAGCTTTCTTT | HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1154.9) | nan | TRBJ1-1*00(208.8) | TRBC1*00(257.7) | 270 | 280 | 304 | 0 | 10 | 50.0 | nan | 26 | 40 | 68 | 16 | 30 | 70.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
19 | 897 | 0.00250215 | TGTGCCAGCTCACCACAGGGGGGGACTGAAGCTTTCTTT | HHHGHHHHHHGHHHHGGGGGHHHHHHHHHHHHHHHHHHH | TRBV18*00(1179.8) | nan | TRBJ1-1*00(208.1) | TRBC1*00(258.1) | 273 | 302 | 310 | 0 | 28 | SC289AST292GST295GDG298 | 86.0 | nan | 25 | 40 | 68 | 24 | 39 | 75.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
20 | 888 | 0.00247704 | TGCGCCAGCAAGACACAGGGCTCGAAGGCTGAAGCTTTCTTT | HHHGHHHHHHHHHHHHHGGGHHHHHHHHHHHHHHHHHHHHHH | TRBV5-1*00(1181.4) | TRBD1*00(30) | TRBJ1-1*00(208.5) | TRBC1*00(257.6) | 270 | 280 | 306 | 0 | 10 | 50.0 | 15 | 21 | 36 | 14 | 20 | 30.0 | 26 | 40 | 68 | 28 | 42 | 70.0 | nan | nan | nan | nan | nan | |||
21 | 877 | 0.00244636 | TGTGCCTGGAGTGATCGGGTGGAGACCCAGTACTTC | HHHHHHHHHHHGGGGGGHHHHHHHHHHHHHHHHHHG | TRBV30*00(1169.1) | nan | TRBJ2-5*00(213.9) | TRBC2*00(258.1) | 270 | 283 | 304 | 0 | 13 | 65.0 | nan | 25 | 40 | 68 | 21 | 36 | 75.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
22 | 734 | 0.00204747 | TGCAGCGTTGTGGACAGGGGAGAGACCCAGTACTTC | HHHHHHGHHHHHHHHHGHHHHHHHHHHHHHHHHHHH | TRBV29-1*00(1210) | TRBD1*00(45) | TRBJ2-5*00(219.2) | TRBC2*00(257.8) | 276 | 286 | 310 | 0 | 10 | 50.0 | 13 | 22 | 36 | 11 | 20 | 45.0 | 24 | 40 | 68 | 20 | 36 | 80.0 | nan | nan | nan | nan | nan | |||
23 | 693 | 0.0019331 | TGTGCCAGTAGTATAAGCTCGAACGGGAACACTGAAGCTTTCTTT | HHHHHHHHHHHHHHHHGHHGHHHGHHHHHHHHHHHHHHHHHHHHH | TRBV19*00(1159.9) | nan | TRBJ1-1*00(233.8) | TRBC1*00(258) | 270 | 285 | 307 | 0 | 15 | 75.0 | nan | 21 | 40 | 68 | 26 | 45 | 95.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
24 | 692 | 0.00193031 | TGTGCCTGGAGTGTACTAGGGGGTAGTCAGCCCCAGCATTTT | HHHHHHHHHHHHHHHHGGGHHHHHHHHGGGGGHHHHHHHHHH | TRBV30*00(1187.8) | TRBD1*00(30) | TRBJ1-5*00(218.5) | TRBC1*00(258) | 270 | 286 | 304 | 0 | 16 | 80.0 | 17 | 23 | 36 | 17 | 23 | 30.0 | 26 | 42 | 70 | 26 | 42 | 80.0 | nan | nan | nan | nan | nan | |||
25 | 678 | 0.00189126 | TGTGCCTGGACCAAGGGACTAGCGGGGGTCAATGAGCAGTTCTTC | HHHHHHHHHHHHHHHHHHGGGGGGGHHHHHHHHHHHHHHHGGGGG | TRBV30*00(1154) | TRBD2*00(60) | TRBJ2-1*00(219.1) | TRBC2*00(256.8) | 270 | 280 | 304 | 0 | 10 | 50.0 | 16 | 28 | 48 | 14 | 26 | 60.0 | 26 | 42 | 70 | 29 | 45 | 80.0 | nan | nan | nan | nan | nan | |||
26 | 667 | 0.00186057 | TGTGCCTGGAGTGTGGGGGCCAGGCCATATAGCAATCAGCCCCAGCATTTT | HHHHHHHHHHGGGGGGHHHHGHHHHHHHHHHHHHHHGGGGGHHHHHHHHHH | TRBV30*00(1167.5) | TRBD1*00(30) | TRBJ1-5*00(259.1) | TRBC1*00(257.3) | 270 | 284 | 304 | 0 | 14 | 70.0 | 18 | 24 | 36 | 14 | 20 | 30.0 | 18 | 42 | 70 | 27 | 51 | 120.0 | nan | nan | nan | nan | nan | |||
27 | 649 | 0.00181036 | TGTGCCAGCAGCCAAGACGGGGGATCCAGCTCCTACGAGCAGTACTTC | HHHHHHHHGHHHHHHHHHGGHHHHHHHHHHHGGGGGHHHHHHHGGHGG | TRBV3-1*00(1172.3) | TRBD1*00(25) | TRBJ2-7*00(244.1) | TRBC2*00(257.8) | 270 | 287 | 307 | 0 | 17 | 85.0 | 18 | 23 | 36 | 18 | 23 | 25.0 | 18 | 39 | 67 | 27 | 48 | 105.0 | nan | nan | nan | nan | nan | |||
28 | 638 | 0.00177968 | TGCAGCGTCTGGGACAGGGAGGCCTACACCTTC | HHHHGGGHHHHHHHHHGHHHHHHHHHHHHHHHH | TRBV29-1*00(1214.7) | TRBD1*00(48) | TRBJ1-2*00(189) | TRBC1*00(258.1) | 276 | 284 | 310 | 0 | 8 | 40.0 | 12 | 24 | 36 | 10 | 23 | I21A | 48.0 | 30 | 40 | 68 | 23 | 33 | 50.0 | nan | nan | nan | nan | nan | ||
29 | 626 | 0.0017462 | TGTGCCTGGAGTGGGGGGGGCGCGGCGCGGGGCACTGAAGCTTTCTTT | HHHHHHHHHHGGGGGGGGGGGGGGGGGGGHHHHHHHHHHHHHHHHHHH | TRBV30*00(1170.2) | TRBD1*00(40) | TRBJ1-1*00(219) | TRBC1*00(257.7) | 270 | 283 | 304 | 0 | 13 | 65.0 | 18 | 26 | 36 | 15 | 23 | 40.0 | 24 | 40 | 68 | 32 | 48 | 80.0 | nan | nan | nan | nan | nan | |||
30 | 622 | 0.00173505 | TGCGCCAGCAGTATAGAAGACGCCCGTAATGAAAAACTGTTTTTT | HGGGHHHHHHHHHHHHGGGGGGGGGHHHHHHGHHHHHHGGHHHHH | TRBV5-1*00(1189.2) | TRBD1*00(25) | TRBJ1-4*00(234) | TRBC1*00(257.8) | 270 | 281 | 306 | 0 | 11 | 55.0 | 23 | 28 | 36 | 20 | 25 | 25.0 | 24 | 43 | 71 | 26 | 45 | 95.0 | nan | nan | nan | nan | nan | |||
31 | 599 | 0.00167089 | TGTGCCAGCAGCGAGATCGGGGGTCCGAGCTCCTACGAGCAGTACTTC | HHHHHHHGGGGHHHGGGGGGGGGGGGHHHHHGGGGGHHHHHHHGHHHG | TRBV5-4*00(1144.3) | TRBD1*00(25) | TRBJ2-7*00(249.4) | TRBC2*00(258.1) | 270 | 282 | 306 | 0 | 12 | 60.0 | 18 | 23 | 36 | 18 | 23 | 25.0 | 17 | 39 | 67 | 26 | 48 | 110.0 | nan | nan | nan | nan | nan | |||
32 | 594 | 0.00165694 | TGTGCCTGGACTGGTGGGGCTAGCACAGATACGCAGTATTTT | HHHHHHHHHHHHGGGGHHHHHHHHHHHHGGGGHHHHHHHHHH | TRBV30*00(1157) | nan | TRBJ2-3*00(259.3) | TRBC2*00(258.2) | 270 | 283 | 304 | 0 | 13 | SG280C | 51.0 | nan | 17 | 41 | 69 | 18 | 42 | 120.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
33 | 552 | 0.00153978 | TGTGCCTGGAGTGTCCAGGGCGTCTACGAGCAGTACTTC | HHHHHHHHHHHHHHHHGGGGGHGGGGGHHHHHHHGHHHH | TRBV30*00(1162.2) | TRBD1*00(25) | TRBJ2-7*00(219.2) | TRBC2*00(258.3) | 270 | 287 | 304 | 0 | 17 | SA284C | 71.0 | 20 | 25 | 36 | 17 | 22 | 25.0 | 23 | 39 | 67 | 23 | 39 | 80.0 | nan | nan | nan | nan | nan | ||
34 | 520 | 0.00145052 | TGTGCCAGCAGCCACCTTGAAACAACCCAAGAGACCCAGTACTTC | HHHHHHHGGHHGHHHHHHHHHGGGGHHHHHHHHHHHHHHHGHHHG | TRBV7-8*00(1138.4) | nan | TRBJ2-5*00(234.1) | TRBC2*00(257.6) | 273 | 285 | 310 | 0 | 12 | 60.0 | nan | 21 | 40 | 68 | 26 | 45 | 95.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
35 | 519 | 0.00144773 | TGTGCCTGGAGTGTGACAGTGATGAATCAGCCCCAGCATTTT | HHHHHHHHHHHHHHHHHHHHHHHHHHHGGGGGHHHHHHHHHH | TRBV30*00(1168.3) | nan | TRBJ1-5*00(229) | TRBC1*00(256) | 270 | 287 | 304 | 0 | 18 | I284G | 73.0 | nan | 24 | 42 | 70 | 24 | 42 | 90.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
36 | 517 | 0.00144215 | TGTGCCTGGAGTACGGGACCGGCGAACACTGAAGCTTTCTTT | HHHHHHHHHHGGHGHGGGGGGGGHHHHHHHHHHHHHHHHHHH | TRBV30*00(1163.4) | TRBD1*00(31),TRBD2*00(30) | TRBJ1-1*00(234.2) | TRBC1*00(258.6) | 270 | 282 | 304 | 0 | 12 | 60.0 | 11 | 20 | 36 | 13 | 22 | SA17C | 31.0;15 | 21 | 48 | 13 | 19 | 30.0 | 21 | 40 | 68 | 23 | 42 | |||
37 | 487 | 0.00135847 | TGTGCCTGGACGGTCCGACAGGCCCAGTACTTC | HHHHHHGGGGGGGGGGHHGGHHHHHHHHHHHHG | TRBV30*00(1150.9) | nan | TRBJ2-5*00(195.4) | TRBC2*00(258) | 270 | 280 | 304 | 0 | 10 | 50.0 | nan | 26 | 40 | 68 | 19 | 33 | SA28G | 56.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
38 | 474 | 0.00132221 | TGTGCCTGGAGTCCGGACGGGACGCAAGAGACCCAGTACTTC | HHHHHHHHHGGGGGGGHGGGGGGHHHHHHHHHHHHHHHHHHG | TRBV30*00(1160.9) | TRBD1*00(30),TRBD2*00(30) | TRBJ2-5*00(228.9) | TRBC2*00(258.1) | 270 | 282 | 304 | 0 | 12 | 60.0 | 11 | 17 | 36 | 17 | 23 | 30.0;15 | 21 | 48 | 17 | 23 | 30.0 | 22 | 40 | 68 | 24 | 42 | ||||
39 | 458 | 0.00127757 | TGTGCCTGGAAGGCGGATACGCAGTATTTT | HHHHHHHHHGGGGGHHGGGGHHHHHHHHHH | TRBV30*00(1155.4) | nan | TRBJ2-3*00(214.1) | TRBC2*00(257.3) | 270 | 280 | 304 | 0 | 10 | 50.0 | nan | 26 | 41 | 69 | 15 | 30 | 75.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
40 | 456 | 0.00127199 | TGTGCCAGCGTGAACAGATACCTCAACTCTGGGGCCAACGTCCTGACTTTC | HHHGGHHHHHHHHHHHHHHHHHHHHHHHGGGHHHHHGGGHHHHHHHHHHHG | TRBV6-5*00(1148) | nan | TRBJ2-6*00(264.1) | TRBC2*00(258) | 270 | 279 | 307 | 0 | 9 | 45.0 | nan | 20 | 45 | 73 | 26 | 51 | 125.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
41 | 450 | 0.00125526 | TGTGCCTGGGTCAAGATCCAGGGGGGTACTGAAGCTTTCTTT | HHHHGHHHHHHHHHHHHHGGGGHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1149.6) | TRBD1*00(35) | TRBJ1-1*00(214) | TRBC1*00(258.6) | 270 | 279 | 304 | 0 | 9 | 45.0 | 16 | 23 | 36 | 18 | 25 | 35.0 | 25 | 40 | 68 | 27 | 42 | 75.0 | nan | nan | nan | nan | nan | |||
42 | 411 | 0.00114647 | TGTGCCAGCAGCCAAGAGGTAAGGGGTCGGGGCCACAATCAGCCCCAGCATTTT | HHHHHHHHGHHHHHHHHHHHHGGGGGGGGHHHGHHHHHHGGGGGHHHHHHHHHH | TRBV4-2*00(1136.5) | TRBD1*00(26),TRBD2*00(25) | TRBJ1-5*00(234.5) | TRBC1*00(257) | 270 | 287 | 307 | 0 | 17 | 85.0 | 17 | 25 | 36 | 21 | 29 | SG22T | 26.0;29 | 34 | 48 | 29 | 34 | 25.0 | 23 | 42 | 70 | 35 | 54 | |||
43 | 407 | 0.00113531 | TGTGCCAGCAGTTACGCAGAATCCTACGAGCAGTACTTC | HHHHHHHHHHGGGGHHHHHHHHGGGGGHHHHHHHGHHHG | TRBV6-5*00(1190.3) | nan | TRBJ2-7*00(226.3) | TRBC2*00(257.7) | 270 | 285 | 307 | 0 | 15 | 75.0 | nan | 21 | 39 | 67 | 21 | 39 | 90.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
44 | 404 | 0.00112694 | TGCAGTGCTAGGAACCGGGACAGGAACACTGAAGCTTTCTTT | HHHHHHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV20-1*00(1199.4) | TRBD1*00(30),TRBD2*00(30) | TRBJ1-1*00(234.6) | TRBC1*00(256.9) | 279 | 290 | 313 | 0 | 11 | 55.0 | 10 | 16 | 36 | 14 | 20 | 30.0;14 | 20 | 48 | 14 | 20 | 30.0 | 18 | 40 | 68 | 20 | 42 | ST20G | |||
45 | 402 | 0.00112136 | TGTGCCTGGAGCCGCGGGACAGAAACAGATACGCAGTATTTT | HHHHHHHGGGGGGGGHHHHHHHHHHHHHGGGGHHHHHHHHHH | TRBV30*00(1156.6) | TRBD1*00(40) | TRBJ2-3*00(228.7) | TRBC2*00(258.1) | 270 | 281 | 304 | 0 | 11 | 55.0 | 11 | 19 | 36 | 14 | 22 | 40.0 | 23 | 41 | 69 | 24 | 42 | 90.0 | nan | nan | nan | nan | nan | |||
46 | 401 | 0.00111857 | TGTGCCAGCAGCTCAGGAGAGACCCAGTACTTC | HHHGHHHHHHHHHHHHHHHHHHHHHHHHHHHHG | TRBV7-6*00(1148.8) | nan | TRBJ2-5*00(219.4) | TRBC2*00(258.7) | 273 | 289 | 310 | 0 | 16 | ST286C | 66.0 | nan | 24 | 40 | 68 | 17 | 33 | 80.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
47 | 386 | 0.00107673 | TGTGCCAGCAGCTTGGGAGGGAACACTGAAGCTTTCTTT | HHHHHHHHHHGHHHGGHHHHHHHHHHHHHHHHHHHHHHH | TRBV5-8*00(1160.7) | nan | TRBJ1-1*00(233.6) | TRBC1*00(257.2) | 270 | 286 | 306 | 0 | 16 | 80.0 | nan | 21 | 40 | 68 | 20 | 39 | 95.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
48 | 382 | 0.00106557 | TGTGCCAGCAGCTCCGGGGGGGCTAAGGAGCAGTTCTTC | HHHHHHHHHHGGGGGGGGGHHHHHHHHHHHHHHHHHHHG | TRBV9*00(1143.1) | TRBD2*00(31),TRBD1*00(30) | TRBJ2-1*00(199.7) | TRBC2*00(256.5) | 270 | 282 | 306 | 0 | 12 | 60.0 | 24 | 33 | 48 | 14 | 23 | SA28G | 31.0;18 | 24 | 36 | 17 | 23 | 30.0 | 27 | 42 | 70 | 24 | 39 | ST29G | ||
49 | 376 | 0.00104884 | TGTGCCTGGGTGCAAAAAGACGCTAGCACAGATACGCAGTATTTT | HHHHGHGGHHHHGHHHGGGGGHHHHHHHHHHGGGGHHHHHHHHHG | TRBV30*00(1154) | nan | TRBJ2-3*00(259.2) | TRBC2*00(257.5) | 270 | 283 | 304 | 0 | 12 | DA279 | 48.0 | nan | 17 | 41 | 69 | 21 | 45 | 120.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
50 | 374 | 0.00104326 | TGTGCCAGCAGCTCGAAGCGTGGGACAAACTACGAGCAGTACTTC | HHHHHHHHHGGHHGGGGGGHHHHHHHHHGGGGGHHHHHHHGGGGG | TRBV11-2*00(1147.1) | TRBD1*00(30) | TRBJ2-7*00(219.3) | TRBC2*00(257.3) | 273 | 286 | 310 | 0 | 13 | 65.0 | 12 | 18 | 36 | 21 | 27 | 30.0 | 23 | 39 | 67 | 29 | 45 | 80.0 | nan | nan | nan | nan | nan | |||
51 | 368 | 0.00102652 | TGTGGCGCGGGCTCCTACAATGAGCAGTTCTTC | HGGGGGGGGGHHHHHHHHHHHHHHHHHHGGGGG | TRBV30*00(1121.8) | nan | TRBJ2-1*00(254.1) | TRBC2*00(258.3) | 270 | 274 | 304 | 0 | 4 | 20.0 | nan | 19 | 42 | 70 | 10 | 33 | 115.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
52 | 364 | 0.00101536 | TGTGCCAGCAGCCCTCGCCGGTCCATGAACACTGAAGCTTTCTTT | HHHHHHHGGHGGGGGGGGGHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV5-4*00(1149.6) | nan | TRBJ1-1*00(249.7) | TRBC1*00(255.8) | 270 | 282 | 306 | 0 | 12 | 60.0 | nan | 15 | 40 | 68 | 20 | 45 | ST17C | 111.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
53 | 352 | 0.000981891 | TGCAGCGTTAAAACAGCCGCGAAAAGGTACGAGCAGTACTTC | HGHHHHHHHHHHHHHGHHGGHHHHHGHGGHHHHHHHHGGGGG | TRBV29-1*00(1212.6) | nan | TRBJ2-7*00(214.2) | TRBC2*00(258.1) | 276 | 288 | 310 | 0 | 12 | SG285A | 46.0 | nan | 24 | 39 | 67 | 27 | 42 | 75.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
54 | 347 | 0.000967943 | TGTGCCAGCAGCTTAGTGATAGGGAACACCGGGGAGCTGTTTTTT | HHHGHHHHHHHHHHHHHHHHHHHHGGGGGGGGHHHHHHGGHHHHH | TRBV7-7*00(1144.4) | nan | TRBJ2-2*00(249.4) | TRBC2*00(257.7) | 273 | 289 | 310 | 0 | 16 | 80.0 | nan | 21 | 43 | 71 | 23 | 45 | 110.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
55 | 345 | 0.000962365 | TGTGCCTGGAGAAATGGAGGGGCTCCCGGTTTTGCCTACGAGCAGTACTTC | HHHHHHHHHHHHHHHGGHGGGGGGGGGGHHHHHHGGGGGHHHHHHHGGHGG | TRBV30*00(1161.3) | TRBD2*00(39) | TRBJ2-7*00(224.5) | TRBC2*00(257.7) | 270 | 281 | 304 | 0 | 11 | 55.0 | 26 | 40 | 48 | 15 | 28 | DC32SC33G | 39.0 | 22 | 39 | 67 | 34 | 51 | 85.0 | nan | nan | nan | nan | nan | ||
56 | 335 | 0.00093447 | TGTGCCAGCAGCGTAGATTGGGCCGTCGGCAATGAGCAGTTCTTC | HHHHHHHGGGGGHHHHHHGGGGGGGGGHHHHHHHHHHHHHGGHGG | TRBV9*00(1168) | TRBD2*00(26) | TRBJ2-1*00(219.4) | TRBC2*00(256.2) | 270 | 286 | 306 | 0 | 16 | 80.0 | 29 | 37 | 48 | 19 | 27 | SC34G | 26.0 | 26 | 42 | 70 | 29 | 45 | 80.0 | nan | nan | nan | nan | nan | ||
57 | 334 | 0.00093168 | TGTGCCAGCAGCGTAGGCTACAGCCAAGAGACCCAGTACTTC | HHHHHHHGGGGGHHHHHHHHGHHHHHHHHHHHHHHHHHHHHH | TRBV9*00(1178.4) | nan | TRBJ2-5*00(234.3) | TRBC2*00(258.4) | 270 | 290 | 306 | 0 | 21 | I285G | 88.0 | nan | 21 | 40 | 68 | 23 | 42 | 95.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
58 | 332 | 0.000926102 | TGCAGCGTTGACGAGGGGGCAGGACAGCCCCAGCATTTT | HHGHGGHGHGHGGGGGHHHHHHHHGGGGGHHHHHHHHHH | TRBV29-1*00(1233.3) | TRBD1*00(30) | TRBJ1-5*00(214.2) | TRBC1*00(257.8) | 276 | 290 | 310 | 0 | 14 | SA287C | 56.0 | 18 | 24 | 36 | 14 | 20 | 30.0 | 27 | 42 | 70 | 24 | 39 | 75.0 | nan | nan | nan | nan | nan | ||
59 | 329 | 0.000917733 | TGTGCCAGCAGCTTAGAGAGGGGGACAGCTAACTATGGCTACACCTTC | HHHGHHHHHHHHHHHHHGGGHHHHHHHHHHHHHHHHHHHHHHHGGGGG | TRBV7-6*00(1157.9) | TRBD1*00(30) | TRBJ1-2*00(249.2) | TRBC1*00(257.4) | 273 | 289 | 310 | 0 | 16 | 80.0 | 17 | 23 | 36 | 18 | 24 | 30.0 | 18 | 40 | 68 | 26 | 48 | 110.0 | nan | nan | nan | nan | nan | |||
60 | 327 | 0.000912154 | TGTGCCAGCAGCGGAAACAGGGGCCGCACAGATACGCAGTATTTT | HHHHHHHGGGGGHHHHHGHGGGGGGHHHHHHGGGGHHHHHHHHHH | TRBV5-4*00(1157.4) | TRBD1*00(35) | TRBJ2-3*00(239.2) | TRBC2*00(258.1) | 270 | 282 | 306 | 0 | 12 | 60.0 | 15 | 22 | 36 | 16 | 23 | 35.0 | 21 | 41 | 69 | 25 | 45 | 100.0 | nan | nan | nan | nan | nan | |||
61 | 320 | 0.000892628 | TGTGCCTGGAGCAAACAGGGCACTGAAGCTTTCTTT | HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1146.4) | TRBD1*00(30) | TRBJ1-1*00(219.3) | TRBC1*00(258.5) | 270 | 281 | 304 | 0 | 11 | 55.0 | 15 | 21 | 36 | 14 | 20 | 30.0 | 24 | 40 | 68 | 20 | 36 | 80.0 | nan | nan | nan | nan | nan | |||
62 | 319 | 0.000889839 | TGTGCCTGGAGTATGGCAGGGGCAGGAAACACCATATATTTT | HHHHHHHHHHHHHHHHGHGHHHHHHHHGHHHHHHHHHHHHHH | TRBV30*00(1165.9) | TRBD1*00(30) | TRBJ1-3*00(228.7) | TRBC1*00(256.9) | 270 | 282 | 304 | 0 | 12 | 60.0 | 16 | 22 | 36 | 16 | 22 | 30.0 | 24 | 42 | 70 | 24 | 42 | 90.0 | nan | nan | nan | nan | nan | |||
63 | 315 | 0.000878681 | TGTGCCTGGAGTGTACAGGGTTTCACCCTCCACTTT | HHHHHHHHHHHHHHHHHGHHHHHGGGGHHHHHHHHH | TRBV30*00(1183.1) | nan | TRBJ1-6*00(188) | TRBC1*00(255.8) | 270 | 287 | 304 | 0 | 17 | 85.0 | nan | 29 | 45 | 73 | 21 | 36 | DC36 | 63.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
64 | 315 | 0.000878681 | TGTGCCTGGAGTAGGGAACAGAACACCGGGGAGCTGTTTTTT | HHHHHHHHHHHHHHHHHHHHHGGGGGGGGHHHHHHGGHHHHH | TRBV30*00(1166.4) | nan | TRBJ2-2*00(248.7) | TRBC2*00(257.8) | 270 | 282 | 304 | 0 | 12 | 60.0 | nan | 21 | 43 | 71 | 20 | 42 | 110.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
65 | 313 | 0.000873102 | TGTGCCTGGAAGTATGCCCCATTGACCCGGGCGAGGAACACTGAAGCTTTCTTT | HHHHHHHHHHHHHHGGGHHHHHHGGGGGGGGGHHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1155.7) | TRBD1*00(35),TRBD2*00(30) | TRBJ1-1*00(234.1) | TRBC1*00(258.3) | 270 | 280 | 304 | 0 | 10 | 50.0 | 0 | 15 | 36 | 15 | 31 | I5ASC5TST8A | 35.0;13 | 19 | 48 | 25 | 31 | 30.0 | 21 | 40 | 68 | 35 | 54 | |||
66 | 302 | 0.000842418 | TGTGCCTGGAGTGACCGGACAGGGAAGGACACTGAAGCTTTCTTT | HHHHHHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1161.5) | TRBD1*00(40) | TRBJ1-1*00(224.2) | TRBC1*00(256.4) | 270 | 283 | 304 | 0 | 13 | 65.0 | 13 | 21 | 36 | 16 | 24 | 40.0 | 23 | 40 | 68 | 28 | 45 | 85.0 | nan | nan | nan | nan | nan | |||
67 | 296 | 0.000825681 | TGTGCCTTACAGAGGGCTAGGTACGAGCAGTACTTC | HHHHHHHHHHHHHHHHHHHGHGGGHHHHHHHGGGGG | TRBV30*00(1138.4) | TRBD1*00(31),TRBD2*00(30) | TRBJ2-7*00(214.3) | TRBC2*00(257.3) | 270 | 277 | 304 | 0 | 7 | 35.0 | 15 | 24 | 36 | 8 | 17 | SG19A | 31.0;27 | 33 | 48 | 11 | 17 | 30.0 | 24 | 39 | 67 | 21 | 36 | |||
68 | 294 | 0.000820102 | TGTGCCTGGAGTGCTGGAAACTATGGCTACACCTTC | HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHGHHH | TRBV30*00(1173.3) | nan | TRBJ1-2*00(215.1) | TRBC1*00(257.7) | 270 | 283 | 304 | 0 | 13 | 65.0 | nan | 22 | 40 | 68 | 18 | 36 | 90.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
69 | 293 | 0.000817313 | TGTGCCTGGACTGGCGGGAGCAGCACAGATACGCAGTATTTT | HHHHHHHHHHGGGGGGHHHHHHHHHHHHGGGGHHHHHHHHHH | TRBV30*00(1288.1) | TRBD2*00(35) | TRBJ2-3*00(243.7) | TRBC2*00(257.8) | 270 | 283 | 304 | 0 | 13 | SG280C | 51.0 | 23 | 30 | 48 | 13 | 20 | 35.0 | 20 | 41 | 69 | 21 | 42 | 105.0 | nan | nan | nan | nan | nan | ||
70 | 290 | 0.000808944 | TGCAGCGGGGCCGGGAGCAGGGATTACCAAGAGACCCAGTACTTC | HGGGGGGGGGGGGHHHHHHHHHHHHHHHHHHHHHHHHHHHGGGGG | TRBV29-1*00(1209.4) | TRBD1*00(43) | TRBJ2-5*00(244.1) | TRBC2*00(258.4) | 276 | 283 | 310 | 0 | 7 | 35.0 | 10 | 21 | 36 | 10 | 22 | I16G | 43.0 | 19 | 40 | 68 | 24 | 45 | 105.0 | nan | nan | nan | nan | nan | ||
71 | 282 | 0.000786628 | TGTGCCAGCAGCGTAGAGGCCTCCACAGATACGCAGTATTTT | HHHHHHHGGGGGHHHHHGGGHHHHHHHHGGGGHHHHHHHHHH | TRBV9*00(1167.2) | nan | TRBJ2-3*00(233.8) | TRBC2*00(258) | 270 | 286 | 306 | 0 | 16 | 80.0 | nan | 22 | 41 | 69 | 23 | 42 | 95.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
72 | 280 | 0.00078105 | TGCAGCGTTGAATCGCGATCTCCGGGACTTTTT | HHGHGGHHHHHGGGGHHHGGGGGHHHHGHHHHH | TRBV29-1*00(1231) | TRBD2*00(40),TRBD1*00(35) | TRBJ1-1*00(153.9) | TRBC1*00(257.3) | 276 | 288 | 310 | 0 | 12 | 60.0 | 14 | 22 | 48 | 21 | 29 | 40.0;10 | 17 | 36 | 21 | 28 | 35.0 | 37 | 40 | 68 | 30 | 33 | ||||
73 | 272 | 0.000758734 | TGTGCCAGCCAGGGGAGCCATGAGCAGTTCTTC | HHHHGGHHHGHGGGHHHHHHHHHHHHHHGGGGG | TRBV19*00(1125.3) | TRBD1*00(30) | TRBJ2-1*00(209.2) | TRBC2*00(256.8) | 270 | 278 | 307 | 0 | 8 | 40.0 | 16 | 22 | 36 | 9 | 15 | 30.0 | 28 | 42 | 70 | 19 | 33 | 70.0 | nan | nan | nan | nan | nan | |||
74 | 271 | 0.000755944 | TGCAGCGTTGGCCCCGACTTCAATCAGCCCCAGCATTTT | HGGGGGGHGGGGGGGHHHHHHHHHGGGGGHHHHHHHHHH | TRBV29-1*00(1219.3) | TRBD1*00(25) | TRBJ1-5*00(233.8) | TRBC1*00(257.2) | 276 | 286 | 310 | 0 | 10 | 50.0 | 0 | 5 | 36 | 10 | 15 | 25.0 | 23 | 42 | 70 | 20 | 39 | 95.0 | nan | nan | nan | nan | nan | |||
75 | 269 | 0.000750365 | TGTGCCTGGAGTGTGGACTCCCCGGGCAATCAGCCCCAGCATTTT | HHHHHHHHHHGHHHHHGGGGGGGHHHHHHHGGGGGHHHHHHHHHH | TRBV30*00(1182.6) | nan | TRBJ1-5*00(239.1) | TRBC1*00(256.5) | 270 | 291 | 304 | 0 | 21 | SA284GSC285G | 77.0 | nan | 22 | 42 | 70 | 25 | 45 | 100.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | |
76 | 269 | 0.000750365 | TGTGCCAGCAGCCGCGGGACTCTCCTGTTCTTC | HHHGHHHGGGGGGGGHHHHGHGHHHHHHHGGGG | TRBV7-6*00(1139) | TRBD2*00(35),TRBD1*00(30) | TRBJ2-1*00(174.4) | TRBC2*00(258.8) | 273 | 285 | 310 | 0 | 12 | 60.0 | 15 | 22 | 48 | 14 | 21 | 35.0;11 | 17 | 36 | 14 | 20 | 30.0 | 35 | 42 | 70 | 26 | 33 | ||||
77 | 267 | 0.000744786 | TGTGCCAGCAGTTACGGCGGGGGGCCCTCCTACGAGCAGTACTTC | HHHGHHHHHHGHGGHGGGGGGGGGGGHHGGGGGHHHHHHHGGGGG | TRBV6-5*00(1178.4) | TRBD2*00(38) | TRBJ2-7*00(234) | TRBC2*00(252.6) | 270 | 285 | 307 | 0 | 15 | 75.0 | 23 | 34 | 48 | 16 | 26 | DA28 | 38.0 | 20 | 39 | 67 | 26 | 45 | 95.0 | nan | nan | nan | nan | nan | ||
78 | 266 | 0.000741997 | TGTGCCTGGGCCACGGTACGGACGGGAGACACTGAAGCTTTCTTT | HHHHGHGHHGGHGGGGHHGGGHGHHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1154.1) | TRBD2*00(30) | TRBJ1-1*00(224) | TRBC1*00(256.9) | 270 | 279 | 304 | 0 | 9 | 45.0 | 24 | 30 | 48 | 22 | 28 | 30.0 | 23 | 40 | 68 | 28 | 45 | 85.0 | nan | nan | nan | nan | nan | |||
79 | 263 | 0.000733629 | TGTGCCGTGAAGGGTAGCGGGAGATACGAGCAGTACTTC | HGGGGHHHHHHHHGGGGGHHHHHHGGGHHHHHHHGGHGG | TRBV30*00(1150.1) | TRBD2*00(35) | TRBJ2-7*00(213.9) | TRBC2*00(258.1) | 270 | 285 | 304 | 0 | 16 | I276GSG278AST281G | 35.0 | 23 | 30 | 48 | 16 | 23 | 35.0 | 24 | 39 | 67 | 24 | 39 | 75.0 | nan | nan | nan | nan | nan | ||
80 | 261 | 0.00072805 | TGTGCCTGGAGTCTGCTAGCGGGAGGGAACAATGAGCAGTTCTTC | HHHHHHHHHHHHHHHGGGGGGGHHHHHHHHHHHHHHHHHHGGGGG | TRBV30*00(1161.7) | TRBD2*00(60) | TRBJ2-1*00(223.4) | TRBC2*00(258.3) | 270 | 282 | 304 | 0 | 12 | 60.0 | 20 | 32 | 48 | 15 | 27 | 60.0 | 25 | 42 | 70 | 28 | 45 | 85.0 | nan | nan | nan | nan | nan | |||
81 | 261 | 0.00072805 | TGTGCCTGGAGTCCCAGGATGAACACTGAAGCTTTCTTT | HHHHHHHHHHHHGHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1157.6) | nan | TRBJ1-1*00(244.3) | TRBC1*00(258.2) | 270 | 282 | 304 | 0 | 12 | 60.0 | nan | 19 | 40 | 68 | 18 | 39 | 105.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
82 | 259 | 0.000722471 | TGTGCCTGGAGCGGGACAGGGATCAATGAGCAGTTCTTC | HHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHHHGGHGG | TRBV30*00(1154.6) | TRBD1*00(50) | TRBJ2-1*00(218.5) | TRBC2*00(258.4) | 270 | 281 | 304 | 0 | 11 | 55.0 | 11 | 21 | 36 | 11 | 21 | 50.0 | 26 | 42 | 70 | 23 | 39 | 80.0 | nan | nan | nan | nan | nan | |||
83 | 256 | 0.000714102 | TGTGCCTGGAGTGGCGGACTGAGTAGCAATCAGCCCCAGCATTTT | HHHHHHHHHHGGGGGHHHHHHHHHHHHHHHGGGGGHHHHHHHHHH | TRBV30*00(1257.6) | TRBD2*00(25) | TRBJ1-5*00(248.2) | TRBC1*00(257.9) | 270 | 283 | 304 | 0 | 13 | 65.0 | 17 | 22 | 48 | 15 | 20 | 25.0 | 20 | 42 | 70 | 23 | 45 | 110.0 | nan | nan | nan | nan | nan | |||
84 | 255 | 0.000711313 | TGTGCCTGGAGTACCCTCGACAGGGCGAACTATGGCTACACCTTC | HHHHHHHHHHHHGGGGHGHHHGGGGGHHHHHHHHHHHHHHHHHHH | TRBV30*00(1165.5) | TRBD1*00(30) | TRBJ1-2*00(216.5) | TRBC1*00(258) | 270 | 282 | 304 | 0 | 12 | 60.0 | 14 | 20 | 36 | 18 | 24 | 30.0 | 19 | 40 | 68 | 24 | 45 | ST21G | 91.0 | nan | nan | nan | nan | nan | ||
85 | 254 | 0.000708523 | TGTGCCTGGAGGGACGCTTCTGGAAACACCATATATTTT | HHHHHHHHHHGGGGGHHHHHHHHHGHHHHHHHHHHHHHH | TRBV30*00(1158) | nan | TRBJ1-3*00(244.6) | TRBC1*00(258.5) | 270 | 281 | 304 | 0 | 11 | 55.0 | nan | 21 | 42 | 70 | 18 | 39 | 105.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
86 | 254 | 0.000708523 | TGTGCCTGGAGACGACAGGGGGCTAACTATGGCTACACCTTC | HHHHHHHHHHGGGHHGGGGHHHHHHHHHHHHHHHHHHGGHHH | TRBV30*00(1163.9) | TRBD1*00(40) | TRBJ1-2*00(243.9) | TRBC1*00(258.1) | 270 | 281 | 304 | 0 | 11 | 55.0 | 14 | 22 | 36 | 13 | 21 | 40.0 | 19 | 40 | 68 | 21 | 42 | 105.0 | nan | nan | nan | nan | nan | |||
87 | 251 | 0.000700155 | TGTGCCAGCAGCTTGGGGGGGGGTACGGAAGAGCAGTTCTTC | HHHHHHHHHHGHGGGGGGGHHGGHHGHHHHHHHHHHHHHHHG | TRBV5-6*00(1166.5) | TRBD1*00(25) | TRBJ2-1*00(199.1) | TRBC2*00(257.6) | 270 | 286 | 306 | 0 | 16 | 80.0 | 18 | 23 | 36 | 16 | 21 | 25.0 | 30 | 42 | 70 | 30 | 42 | 60.0 | nan | nan | nan | nan | nan | |||
88 | 250 | 0.000697366 | TGTGCCAGCAGTTACTCGAAGGTTTCAGACCCCGGACCTGGAAACACCATATATTTT | HHHGHHHHHHHHHHHHHHHHHHHHHHHGGGGGGHHHHHHHHHGHHHHHHHHHHHHHH | TRBV6-5*00(1206.9) | TRBD2*00(27),TRBD1*00(26) | TRBJ1-3*00(239) | TRBC1*00(257.1) | 270 | 289 | 307 | 0 | 19 | 95.0 | 10 | 21 | 48 | 26 | 37 | ST12ASG16C | 27.0;9 | 17 | 36 | 29 | 37 | SG12C | 26.0 | 22 | 42 | 70 | 37 | 57 | ||
89 | 249 | 0.000694576 | TGTGCCTGCCCAAGAGACAGGGTCTATGGCTACACCTTC | HHHGGHHHHHHHHHHHHHHHHHHHHHHHHHHHHHGGGGG | TRBV30*00(1142.4) | TRBD1*00(35) | TRBJ1-2*00(219.1) | TRBC1*00(257.5) | 270 | 278 | 304 | 0 | 8 | 40.0 | 14 | 21 | 36 | 15 | 22 | 35.0 | 24 | 40 | 68 | 23 | 39 | 80.0 | nan | nan | nan | nan | nan | |||
90 | 245 | 0.000683418 | TGTGCCAGCAGCTTATATTACAGGGTTGGGGCTTTCTTT | HHHHHHHHHHHHHHHHHHHHHHGGGGGGHHHHHHHHHHH | TRBV5-8*00(1131.2) | TRBD1*00(30) | TRBJ1-1*00(184.3) | TRBC1*00(257.8) | 270 | 284 | 306 | 0 | 14 | 70.0 | 15 | 21 | 36 | 19 | 25 | 30.0 | 31 | 40 | 68 | 30 | 39 | 45.0 | nan | nan | nan | nan | nan | |||
91 | 244 | 0.000680629 | TGTGCCTGGAGTGTATCGGCATCTGGAAACACCATATATTTT | HHHHHHHHHHHHGGGGGHHHHHHHHHHGHHHHHHHHHHHHHH | TRBV30*00(1173.4) | nan | TRBJ1-3*00(243.4) | TRBC1*00(257) | 270 | 285 | 304 | 0 | 15 | 75.0 | nan | 21 | 42 | 70 | 21 | 42 | 105.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
92 | 241 | 0.00067226 | TGTGCCAGTAGTCCGGGCATTTCCTACGAGCAGTACTTC | HHHHHHHHHGGGGGHHHHHHHHGGGGGHHHHHHHGGGGG | TRBV19*00(1143.2) | TRBD1*00(25),TRBD2*00(25) | TRBJ2-7*00(229.4) | TRBC2*00(257.9) | 270 | 282 | 307 | 0 | 12 | 60.0 | 10 | 15 | 36 | 12 | 17 | 25.0;14 | 19 | 48 | 12 | 17 | 25.0 | 21 | 39 | 67 | 21 | 39 | ||||
93 | 241 | 0.00067226 | TGCAGCGTGCTGGACAGGGGGTTGGACAATGAGCAGTTCTTC | HGGHGGHHHHHHHHGGGGGGHHHHHHHHHHHHHHHHHGGHGG | TRBV29-1*00(1214.6) | TRBD1*00(50) | TRBJ2-1*00(223.6) | TRBC2*00(258) | 276 | 284 | 310 | 0 | 8 | 40.0 | 13 | 23 | 36 | 11 | 21 | 50.0 | 25 | 42 | 70 | 25 | 42 | 85.0 | nan | nan | nan | nan | nan | |||
94 | 235 | 0.000655524 | TGCCTTCAGAGCCACATGAACACTGAAGCTTTCTTT | HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH | TRBV20-1*00(1153.2) | nan | TRBJ1-1*00(249.5) | TRBC1*00(258.1) | 279 | 282 | 313 | 0 | 3 | 15.0 | nan | 18 | 40 | 68 | 14 | 36 | 110.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
95 | 234 | 0.000652734 | TGCAGCGTTGAAGCTTCGGCAGGGCAAAAAGCTTTCTTT | HHHHHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHHHH | TRBV29-1*00(1223.9) | TRBD2*00(31) | TRBJ1-1*00(194.1) | TRBC1*00(257.7) | 276 | 289 | 310 | 0 | 13 | 65.0 | 24 | 33 | 48 | 16 | 25 | SG27C | 31.0 | 29 | 40 | 68 | 28 | 39 | 55.0 | nan | nan | nan | nan | nan | ||
96 | 228 | 0.000635997 | TGTGCCATCAGTGAGTTGGGGGGGCTCGTTTTAAAGGCTTCCGAAGCTTTCTTT | HHHHHHHHHHHHHHGGGGGGGGGGGGGHHHHHHHHHHGGGGGHHHHHHHHHHHH | TRBV10-3*00(1143.7) | TRBD1*00(30),TRBD2*00(26) | TRBJ1-1*00(198.7) | TRBC1*00(256.8) | 270 | 286 | 307 | 0 | 16 | 80.0 | 18 | 24 | 36 | 19 | 25 | 30.0;25 | 33 | 48 | 17 | 25 | SA28G | 26.0 | 28 | 40 | 68 | 42 | 54 | |||
97 | 227 | 0.000633208 | TGCAGCGTTGAGGCGGGACGCCGTTACAATGAGCAGTTCTTC | HHHHHHHHHGHHHGGGGGGGGGHHHHHHHHHHHHHHHHHHHG | TRBV29-1*00(1219.9) | TRBD1*00(30),TRBD2*00(30) | TRBJ2-1*00(229.1) | TRBC2*00(258.2) | 276 | 287 | 310 | 0 | 11 | 55.0 | 11 | 17 | 36 | 13 | 19 | 30.0;15 | 21 | 48 | 13 | 19 | 30.0 | 24 | 42 | 70 | 24 | 42 | ||||
98 | 226 | 0.000630419 | TGTGCCAGCAGCGATGACAGCTCCTACGAGCAGTACTTC | HHHHHHHHGHGHHHHHHHHHHHGGGGGHHHHHHHGGGGG | TRBV14*00(1146.8) | nan | TRBJ2-7*00(244) | TRBC2*00(258.3) | 273 | 285 | 310 | 0 | 12 | 60.0 | nan | 18 | 39 | 67 | 18 | 39 | 105.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | ||
99 | 225 | 0.000627629 | TGTGCCTGGAGCGGAAGCACTGAAGCTTTCTTT | HHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHH | TRBV30*00(1162.7) | nan | TRBJ1-1*00(222.1) | TRBC1*00(254.9) | 270 | 281 | 304 | 0 | 11 | 55.0 | nan | 21 | 40 | 68 | 13 | 33 | I24G | 83.0 | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan | nan |
Quality control
Now that all files are processed, let's perform quality control. The first thing to check is the alignment rate, which can be exported with the mixcr exportQc align function.
mixcr exportQc align results/*.clns figs/alignQc.pdf
The plot reveals some issues with the libraries: many samples have a relatively large fraction of unaligned reads, primarily due to the absence of J hits.
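One way to see which samples are most affected is to pull the J-hit failure rate out of each align report. The sketch below assumes per-sample text reports that contain a line of the form "Alignment failed because of absence of J hits: 17953 (4.39%)"; the function name rank_no_j_hits and the report file names in the usage note are hypothetical.

```shell
# Print "<percent><TAB><file>" for each align report, worst sample first.
# Assumes the text report contains a line such as:
#   Alignment failed because of absence of J hits: 17953 (4.39%)
rank_no_j_hits() {
  for report in "$@"; do
    awk -v name="$report" '
      /absence of J hits:/ {
        pct = $NF                # last field, e.g. "(4.39%)"
        gsub(/[()%]/, "", pct)   # strip parentheses and the percent sign
        printf "%s\t%s\n", pct, name
      }' "$report"
  done | sort -k1,1nr            # numeric sort on the percentage, descending
}
```

If the reports were saved as results/*.align.report.txt, running rank_no_j_hits results/*.align.report.txt | head would list the samples with the largest share of J-less reads first.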
MiXCR lets us investigate further. Let's pick one of the samples where the issue is most obvious (e.g. CRC003_therapy-08). To look at the read alignments for that sample, we run the mixcr align command once again, this time with two additional options, -OallowPartialAlignments=true and -OallowNoCDR3PartAlignments=true, which preserve partially aligned reads (e.g. reads that lack a J gene) and reads that lack a CDR3 sequence.
mkdir -p debug
mixcr align \
-s hsa \
-p generic-amplicon \
-OvParameters.geneFeatureToAlign="VTranscriptWithout5UTRWithP" \
-OvParameters.parameters.floatingLeftBound=true \
-OjParameters.parameters.floatingRightBound=false \
-OcParameters.parameters.floatingRightBound=true \
-OallowPartialAlignments=true \
-OallowNoCDR3PartAlignments=true \
raw/CRC003_therapy-08_R1.fastq.gz raw/CRC003_therapy-08_R2.fastq.gz \
debug/CRC003_therapy-08_debug.vdjca
Now we can look at the raw alignments themselves using mixcr exportAlignmentsPretty.
The command below generates a human-readable .txt file with alignments. The --skip 1000 parameter skips the first 1000 reads, since the first reads usually have poor quality, and --limit 100 exports only 100 alignments, as we usually don't need to examine every alignment to see the issue.
mixcr exportAlignmentsPretty \
--skip 1000 \
--limit 100 \
debug/CRC003_therapy-08_debug.vdjca \
debug/CRC003_therapy-08_debug.alignments.txt
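Before reading the alignments one by one, it can help to count how many of the exported reads have no J gene row at all. This is a minimal sketch that assumes each record in the exported file starts with a ">>> Read ids:" line and that gene rows start with the gene name (such as TRBJ2-3*00); the function name count_reads_without_j is hypothetical.

```shell
# Count exported read records and how many of them show no J gene alignment.
# Assumes records start with ">>> Read ids: N" and gene rows start with the
# gene name in the first column (e.g. TRBV15*00, TRBJ2-3*00, TRBC2*00).
count_reads_without_j() {
  awk '
    $1 == ">>>" && $2 == "Read" {   # start of a new read record
      if (in_record) tally()
      in_record = 1; has_j = 0
      next
    }
    $1 ~ /^(TR[ABDG]|IG[HKL])J/ { has_j = 1 }   # a J gene alignment row
    END {
      if (in_record) tally()
      printf "total=%d without_J=%d\n", total, without_j
    }
    function tally() { total++; if (!has_j) without_j++ }
  ' "$1"
}
```

Called as count_reads_without_j debug/CRC003_therapy-08_debug.alignments.txt, it prints a single summary line for the 100 exported alignments.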
Below you can see a few alignments from the generated file. The first one is an example of a well-aligned read.
>>> Read ids: 1113
FR1><CDR1 CDR1><FR2
_ E K P V T L S C S Q T L N H N V M Y W Y Q Q K S S Q
Quality 66666767777777777777777777777777777777777777777777777777777777777777777777777777
Target0 0 AGAAAAGCCAGTGACCCTGAGTTGTTCTCAGACTTTGAACCATAACGTCATGTACTGGTACCAGCAGAAGTCAAGTCAGG 79 Score
TRBV15*00 104 aaagccagtgaccctgagttgttctcagactttgaaccataacgtcatgtactggtaccagcagaagtcaagtcagg 180 1148
FR2><CDR2 CDR2><FR3
A P K L L F H Y Y D K D F N N E A D T P D N F Q S R R
Quality 77777777777777777777777777777777777777777777777777777777777777777777777777777777
Target0 80 CCCCAAAGCTGCTGTTCCACTACTATGACAAAGATTTTAACAATGAAGCAGACACCCCTGATAACTTCCAATCCAGGAGG 159 Score
TRBV15*00 181 ccccaaagctgctgttccactactatgacaaagattttaacaatgaagcagacacccctgataacttccaatccaggagg 260 1148
FR3><CDR3 V
P N I S F C F L D I R S P G L G D A A M Y L C A T S G
Quality 77777777777777777777777777777777777777777777777777777777777777777777777777777777
Target0 160 CCGAACATTTCTTTCTGCTTTCTTGACATCCGCTCACCAGGCCTGGGGGACGCAGCCATGTACCTGTGTGCCACCAGCGG 239 Score
TRBV15*00 261 ccgaacaCttctttctgctttcttgacatccgctcaccaggcctgggggacAcagccatgtacctgtgtgccaccagcAg 340 1148
> <J CDR3><FR4 FR4><C
L G D T Q Y F G P G T R L T V L E D L K N V F P P E
Quality 77777777777777777777777777777777777777777777777777777777777777777777777777777777
Target0 240 ACTAGGGGATACGCAGTATTTTGGCCCAGGCACCCGGCTGACAGTGCTCGAGGACCTGAAAAACGTGTTCCCACCCGAGG 319 Score
TRBV15*00 341 a 341 1148
TRBJ2-3*00 26 gatacgcagtattttggcccaggcacccggctgacagtgctcg 68 215
TRBC2*00 0 aggacctgaaaaacgtgttcccacccgagg 29 260
V A V F E P S D S Q _
Quality 7777777777777777777777777766666
Target0 320 TCGCTGTGTTTGAGCCATCAGATAGTCAATG 350 Score
TRBC2*00 30 tcgctgtgtttgagccatcaga 51 260
Now, the following pair of reads failed to align.
>>> Read ids: 1115
_ N P V G L R C Y P T S V F F C V Y L Y Q Q K P F P C
Quality 33553553353536366363363333633322336733363663633333633633733363222632222333655322
Target0 0 CAAACCCCGTGGGGCTGAGGTGCTACCCAACCTCTGTCTTTTTCTGTGTGTACTTGTACCAACAAAAACCCTTCCCCTGC 79 Score
P G S P S K N Y Q A E G G G D G E G E E G V S G G R R
Quality 62222252552522333333633336333225522622222222252522525222222222222233522525222522
Target0 80 CCCGGGTCCCCCAGTAAGAATTATCAGGCCGAAGGGGGAGGAGACGGGGAGGGGGAAGAAGGAGTATCCGGGGGGCGGCG 159 Score
FR3><CDR3
V V L K K K L N L S S L E L V D S A L Y F C A S V G
Quality 22242225222672224452252522242225242626256626244262222424525222625452225222252225
Target0 160 CGTTGTCTTAAAAAAAAAACTAAACCTGAGCTCTCTGGAGCTGGTGGACTCAGCTTTGTATTTCTGTGCCAGCGTCGGGT 239 Score
TRBV9*00 280 aactaaacctgagctctctggagctggGggactcagctttgtatttctgtgccagc--AgCgt 340 200
V><VP VP>
S H H I
Quality 242 22 252224
Target0 240 CGC-AC-CACATA 250 Score
TRBV9*00 341 AgcTacGcTGCtG 353 200
<CDR2 CDR2><FR3
Y H K G E E R A K G N I L E R F S A Q Q F P D F L F F
Quality 22224664272772777762664467626552222267767777765422666277777777766636665226665653
Target1 0 TATCATAAAGGAGAAGAGAGAGCAAAAGGAAACATTCTTGAACGATTCTCCGCACAACAGTTCCCTGACTTTCTTTTTTT 79 Score
TRBV9*00 201 tatTataaTggagaagagagagcaaaaggaaacattcttgaacgattctccgcacaacagttccctgactt 271 327
Q A E D G I R H R S R H S C * T A L P I * T P V E L
Quality 27773753566265533552552226776565636377677777752777762626237777377772777667667733
Target1 80 TCAAGCAGAAGACGGCATACGACATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATTTGAACCCCTGTGGAGCTGA 159 Score
R C N Y S S S V S V Y L F * Y P E P * P C R V P A E R
Quality 67777776777762277637766776776776667776676776636666776636366767337776636766636367
Target1 160 GGTGCAACTACTCATCGTCTGTTTCAGTGTATCTCTTCTGATATCCAGAACCCTGACCCTGCCGTGTACCAGCTGAGAGA 239 Score
L S Q _
Quality 66776666366
Target1 240 CTTAGTCAGCC 250 Score
One can use BLAST to search for the unaligned parts of the sequence and find out their origin.
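To prepare sequences for such a search, the target rows of the exported alignments can be stitched back together into FASTA records. The sketch below assumes that each Target row has the layout "TargetN start SEQUENCE end Score" and that "-" characters are alignment gaps; the function name pretty_to_fasta and the FASTA header format are hypothetical.

```shell
# Rebuild the target sequences of every read from exportAlignmentsPretty
# output and print them as FASTA, one record per target (Target0, Target1).
# Assumes rows of the form: "Target0 <start> <SEQUENCE> <end> Score".
pretty_to_fasta() {
  awk '
    $1 == ">>>" && $2 == "Read" { emit(); read_id = $NF; next }
    $1 ~ /^Target[0-9]+$/ {
      seq = $3
      gsub(/-/, "", seq)                 # drop alignment gap characters
      if (!($1 in targets)) order[++n] = $1
      targets[$1] = targets[$1] seq      # append this chunk to the target
    }
    END { emit() }
    function emit(   i) {
      for (i = 1; i <= n; i++)
        printf ">read_%s_%s\n%s\n", read_id, order[i], targets[order[i]]
      split("", targets); split("", order); n = 0   # reset for next read
    }
  ' "$1"
}
```

The resulting records can be pasted into a BLAST search; for a read like 1115 above, the unaligned stretch at the start of Target0 would be the part of interest.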
Another quality report we should investigate is a chain abundance plot.
mixcr exportQc chainUsage \
results/*.clns \
figs/chainUsage.pdf
Most of the samples consist of roughly equal shares of TRA and TRB chains. However, a few samples (e.g. CRC013_therapy-02) contain no TRA chains, which might indicate cDNA library preparation issues and needs further investigation.
Full-length clonotype assembly
The iRepertoire protocol allows us to recover a broader TCR sequence than just the CDR3 region. According to the protocol, forward primers are located in the FR1 region, so we can safely use an assembling feature that starts from CDR1 and be sure that no primers affect the original sequence. The reverse primers are located in the FR4 region very close to CDR3, so there is not much of FR4 left to include in clone assembly.
Taking the above into account, the longest possible assembling feature for this protocol is "{CDR1Begin:CDR3End}".
MiXCR has a dedicated preset to obtain full-length TCR clones with the iRepertoire protocol:
mixcr analyze irepertoire-human-rna-xcr-repseq-lr \
raw/CRC016_preTherapy_R1.fastq.gz \
raw/CRC016_preTherapy_R2.fastq.gz \
results/CRC016_preTherapy
The mixcr assemble step in this preset differs from the one above as follows:
mixcr assemble \
-OassemblingFeatures="{CDR1Begin:CDR3End}" \
-OseparateByJ=true \
--report results/CRC016_preTherapy.assemble.report.txt \
--json-report results/CRC016_preTherapy.assemble.report.json \
results/CRC016_preTherapy.vdjca \
results/CRC016_preTherapy.clns
-OassemblingFeatures="{CDR1Begin:CDR3End}"
- sets the assembling feature to the region that starts at CDR1Begin and ends at the end of CDR3.
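To apply the full-length preset to every sample rather than a single one, the analyze command can be generated per sample. This is a dry-run sketch that only prints the commands; it assumes paired files named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz in one directory with no spaces in the paths, and the function name print_analyze_commands is hypothetical.

```shell
# Print (not run) one `mixcr analyze` command per sample in a directory.
# Sample names are derived from the *_R1.fastq.gz files; the R2 file is
# assumed to sit next to it under the matching name.
print_analyze_commands() {
  raw_dir=$1
  out_dir=$2
  for r1 in "$raw_dir"/*_R1.fastq.gz; do
    [ -e "$r1" ] || continue               # skip if the glob matched nothing
    sample=$(basename "$r1" _R1.fastq.gz)
    printf 'mixcr analyze irepertoire-human-rna-xcr-repseq-lr %s %s %s\n' \
      "$r1" "$raw_dir/${sample}_R2.fastq.gz" "$out_dir/$sample"
  done
}
```

Running print_analyze_commands raw results lists the commands for review; piping that output to sh would execute them one after another.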
Reports
Finally, MiXCR provides a convenient way to look at the reports generated at each step. Every .vdjca, .clns and .clna file holds the reports for every MiXCR function that has been applied to the sample. E.g. in our case the .clns file contains reports for mixcr align and mixcr assemble. To output these reports, use mixcr exportReports as shown below. Note that the --json parameter outputs a JSON-formatted report.
mixcr exportReports \
results/CRC016_preTherapy.clns \
figs/CRC016_preTherapy.report.txt
mixcr exportReports \
--json \
results/CRC016_preTherapy.clns \
figs/CRC016_preTherapy.report.json
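When many samples have been processed, the two commands above can be repeated for each .clns file. The following is a minimal sketch, assuming all clonotype files sit in one directory and that the output names follow the `<sample>.report.txt` / `<sample>.report.json` pattern used above. The helper names `run` and `export_all_reports` and the `DRY_RUN` switch are hypothetical and not part of MiXCR; only the two `mixcr exportReports` invocations are taken from this tutorial.

```shell
#!/usr/bin/env bash
# Sketch: export the text and JSON reports for every .clns file in a directory.
# With DRY_RUN=1 (the default in this sketch) commands are printed, not executed.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# export_all_reports <results_dir> <out_dir>  (hypothetical helper, not a MiXCR command)
export_all_reports() {
  local results_dir="$1" out_dir="$2" clns sample
  for clns in "$results_dir"/*.clns; do
    [ -e "$clns" ] || continue   # skip the unexpanded pattern when nothing matches
    sample="$(basename "$clns" .clns)"
    run mixcr exportReports "$clns" "$out_dir/$sample.report.txt"
    run mixcr exportReports --json "$clns" "$out_dir/$sample.report.json"
  done
}

export_all_reports results figs
```

Run it once as is to inspect the generated commands, then rerun with `DRY_RUN=0` to execute them.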
Show report file
============== Align Report ==============
Input file(s): /raw/iRepertoire/CRC016_preTherapy_R1.fastq.gz,/raw/iRepertoire/CRC016_preTherapy_R2.fastq.gz
Output file(s): results-trimmed/CRC016_preTherapy.vdjca
Version: 4.0.0-331-protocols; built=Thu Oct 06 19:33:27 CEST 2022; rev=83ef6ba9c4; lib=repseqio.v2.0
Command line arguments: --report results-trimmed/CRC016_preTherapy.align.report.txt --json-report results-trimmed/CRC016_preTherapy.align.report.json --preset local:irepertoire-human-tcr-lr-cdr3 /raw/iRepertoire/CRC016_preTherapy_R1.fastq.gz /raw/iRepertoire/CRC016_preTherapy_R2.fastq.gz results-trimmed/CRC016_preTherapy.vdjca
Analysis time: 0ns
Total sequencing reads: 409090
Successfully aligned reads: 386658 (94.52%)
Paired-end alignment conflicts eliminated: 3486 (0.85%)
Alignment failed, no hits (not TCR/IG?): 4045 (0.99%)
Alignment failed because of absence of V hits: 29 (0.01%)
Alignment failed because of absence of J hits: 17953 (4.39%)
No target with both V and J alignments: 203 (0.05%)
Alignment failed because of low total score: 202 (0.05%)
Overlapped: 392724 (96%)
Overlapped and aligned: 371487 (90.81%)
Alignment-aided overlaps: 722 (0.19%)
Overlapped and not aligned: 21237 (5.19%)
No CDR3 parts alignments, percent of successfully aligned: 94 (0.02%)
Partial aligned reads, percent of successfully aligned: 575 (0.15%)
V gene chimeras: 481 (0.12%)
TRA chains: 370 (0.1%)
TRA non-functional: 57 (15.41%)
TRB chains: 386282 (99.9%)
TRB non-functional: 8075 (2.09%)
TRD chains: 5 (0%)
TRD non-functional: 0 (0%)
IGH chains: 1 (0%)
IGH non-functional: 0 (0%)
Realigned with forced non-floating bound: 34176 (8.35%)
Realigned with forced non-floating right bound in left read: 9768 (2.39%)
Realigned with forced non-floating left bound in right read: 9768 (2.39%)
============== Assemble Report ==============
Input file(s): results-trimmed/CRC016_preTherapy.vdjca
Output file(s): results-trimmed/CRC016_preTherapy.clns
Version: 4.0.0-331-protocols; built=Thu Oct 06 19:33:27 CEST 2022; rev=83ef6ba9c4; lib=repseqio.v2.0
Command line arguments: --report results-trimmed/CRC016_preTherapy.assemble.report.txt --json-report results-trimmed/CRC016_preTherapy.assemble.report.json results-trimmed/CRC016_preTherapy.vdjca results-trimmed/CRC016_preTherapy.clns
Analysis time: 0ns
Final clonotype count: 24390
Average number of reads per clonotype: 14.7
Reads used in clonotypes, percent of total: 358492 (87.63%)
Reads used in clonotypes before clustering, percent of total: 379951 (92.88%)
Number of reads used as a core, percent of used: 377014 (99.23%)
Mapped low quality reads, percent of used: 2937 (0.77%)
Reads clustered in PCR error correction, percent of used: 21459 (5.65%)
Reads pre-clustered due to the similar VJC-lists, percent of used: 76 (0.02%)
Reads dropped due to the lack of a clone sequence, percent of total: 2924 (0.71%)
Reads dropped due to a too short clonal sequence, percent of total: 143 (0.03%)
Reads dropped due to low quality, percent of total: 1 (0%)
Reads dropped due to failed mapping, percent of total: 3571 (0.87%)
Reads dropped with low quality clones, percent of total: 1 (0%)
Clonotypes eliminated by PCR error correction: 9372
Clonotypes dropped as low quality: 1
Clonotypes pre-clustered due to the similar VJC-lists: 69
TRB chains: 24390 (100%)
TRB non-functional: 1117 (4.58%)
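The percentages in the text report can be recomputed from the raw counts, which is handy when comparing samples in a script. The sketch below assumes the report keeps the `Label: count (percent)` layout shown above, with no leading indentation; `report_value` and `percent` are hypothetical helper names. Because some labels (e.g. TRB chains) occur in both the align and the assemble section, the helper returns only the first match.

```shell
#!/usr/bin/env bash
# Sketch: read counters from a MiXCR text report and recompute a percentage.

# Print the first integer following "<label>: " (first matching line only).
report_value() {
  awk -F': ' -v label="$1" '$1 == label { split($2, f, " "); print f[1]; exit }' "$2"
}

# Print 100 * part / total with two decimals.
percent() {
  LC_ALL=C awk -v p="$1" -v t="$2" 'BEGIN { printf "%.2f\n", 100 * p / t }'
}

# A few lines copied from the align report above, for a self-contained demo.
report="$(mktemp)"
cat > "$report" <<'EOF'
Total sequencing reads: 409090
Successfully aligned reads: 386658 (94.52%)
Alignment failed because of absence of J hits: 17953 (4.39%)
EOF

total="$(report_value 'Total sequencing reads' "$report")"
aligned="$(report_value 'Successfully aligned reads' "$report")"
no_j="$(report_value 'Alignment failed because of absence of J hits' "$report")"

echo "aligned: $(percent "$aligned" "$total")%"
echo "no J hits: $(percent "$no_j" "$total")%"
rm -f "$report"
```

To use it on a real file, pass the path written by mixcr exportReports (for example figs/CRC016_preTherapy.report.txt) as the second argument of `report_value`.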
{
"type": "alignerReport",
"commandLine": "--report results-trimmed/CRC016_preTherapy.align.report.txt --json-report results-trimmed/CRC016_preTherapy.align.report.json --preset local:irepertoire-human-tcr-lr-cdr3 /raw/iRepertoire/CRC016_preTherapy_R1.fastq.gz /raw/iRepertoire/CRC016_preTherapy_R2.fastq.gz results-trimmed/CRC016_preTherapy.vdjca",
"inputFiles": [
"/raw/iRepertoire/CRC016_preTherapy_R1.fastq.gz",
"/raw/iRepertoire/CRC016_preTherapy_R2.fastq.gz"
],
"outputFiles": [
"results-trimmed/CRC016_preTherapy.vdjca"
],
"version": "4.0.0-331-protocols; built=Thu Oct 06 19:33:27 CEST 2022; rev=83ef6ba9c4; lib=repseqio.v2.0",
"trimmingReport": null,
"totalReadsProcessed": 409090,
"aligned": 386658,
"notAligned": 22432,
"notAlignedReasons": {
"NoCDR3Parts": 0,
"NoBarcode": 0,
"LowTotalScore": 202,
"NoHits": 4045,
"VAndJOnDifferentTargets": 203,
"NoVHits": 29,
"NoJHits": 17953
},
"chimeras": 0,
"overlapped": 392724,
"alignmentAidedOverlaps": 722,
"overlappedAligned": 371487,
"overlappedNotAligned": 21237,
"pairedEndAlignmentConflicts": 3486,
"vChimeras": 481,
"jChimeras": 0,
"chainUsage": {
"type": "chainUsage",
"chimeras": 0,
"total": 386658,
"chains": {
"TRA": {
"total": 370,
"nonFunctional": 57,
"isOOF": 41,
"hasStops": 16
},
"TRB": {
"total": 386282,
"nonFunctional": 8075,
"isOOF": 6763,
"hasStops": 1312
},
"TRD": {
"total": 5,
"nonFunctional": 0,
"isOOF": 0,
"hasStops": 0
},
"IGH": {
"total": 1,
"nonFunctional": 0,
"isOOF": 0,
"hasStops": 0
}
}
},
"realignedWithForcedNonFloatingBound": 34176,
"realignedWithForcedNonFloatingRightBoundInLeftRead": 9768,
"realignedWithForcedNonFloatingLeftBoundInRightRead": 9768,
"noCDR3PartsAlignments": 94,
"partialAlignments": 575,
"tagReport": null
}
{
"type": "assemblerReport",
"commandLine": "--report results-trimmed/CRC016_preTherapy.assemble.report.txt --json-report results-trimmed/CRC016_preTherapy.assemble.report.json results-trimmed/CRC016_preTherapy.vdjca results-trimmed/CRC016_preTherapy.clns",
"inputFiles": [
"results-trimmed/CRC016_preTherapy.vdjca"
],
"outputFiles": [
"results-trimmed/CRC016_preTherapy.clns"
],
"version": "4.0.0-331-protocols; built=Thu Oct 06 19:33:27 CEST 2022; rev=83ef6ba9c4; lib=repseqio.v2.0",
"preCloneAssemblerReport": null,
"totalReadsProcessed": 409090,
"initialClonesCreated": 33832,
"readsDroppedNoTargetSequence": 2924,
"readsDroppedTooShortClonalSequence": 143,
"readsDroppedLowQuality": 68,
"coreReads": 377014,
"readsDroppedFailedMapping": 3571,
"lowQualityRescued": 2937,
"clonesClustered": 9372,
"readsClustered": 21459,
"clones": 24390,
"clonesDroppedAsLowQuality": 1,
"clonesPreClustered": 69,
"readsPreClustered": 76,
"readsInClones": 358492,
"readsInClonesBeforeClustering": 379951,
"readsDroppedWithLowQualityClones": 1,
"clonalChainUsage": {
"type": "chainUsage",
"chimeras": 0,
"total": 24390,
"chains": {
"TRB": {
"total": 24390,
"nonFunctional": 1117,
"isOOF": 905,
"hasStops": 212
}
}
}
}
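Several counters in the two JSON reports are related to each other, and checking these relations is a quick sanity test when reading a report. The sketch below uses values copied by hand from the JSON above; the `check` helper and the variable names are hypothetical. The relations hold for this particular report and are shown as an observation, not as a documented MiXCR guarantee. The NoCDR3Parts and NoBarcode reasons are zero here and are left out of the sum.

```shell
#!/usr/bin/env bash
# Sketch: cross-check counters copied by hand from the JSON reports above.

# alignerReport
total_reads=409090; aligned=386658; not_aligned=22432
no_hits=4045; no_v=29; no_j=17953; v_j_split=203; low_score=202

# assemblerReport
initial_clones=33832; clones_clustered=9372; clones_pre_clustered=69
clones_dropped_low_quality=1; final_clones=24390
core_reads=377014; low_quality_rescued=2937
reads_before_clustering=379951; reads_clustered=21459; reads_in_clones=358492

# check <description> <computed> <reported>
check() {
  if [ "$2" -eq "$3" ]; then
    echo "ok   $1: $2"
  else
    echo "FAIL $1: $2 != $3"
  fi
}

check "aligned + notAligned" $((aligned + not_aligned)) "$total_reads"
check "sum of notAlignedReasons" $((no_hits + no_v + no_j + v_j_split + low_score)) "$not_aligned"
check "clones after clustering" $((initial_clones - clones_clustered - clones_pre_clustered - clones_dropped_low_quality)) "$final_clones"
check "core + rescued reads" $((core_reads + low_quality_rescued)) "$reads_before_clustering"
check "reads after clustering" $((reads_before_clustering - reads_clustered)) "$reads_in_clones"
```

Reading the assemble report this way shows where clonotypes go: of the 33832 initial clones, 9372 are merged by PCR error correction, 69 are pre-clustered and 1 is dropped as low quality, leaving the 24390 final clonotypes.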