assemble report
MiXCR generates a comprehensive summary of assembly performance. An assemble report can be generated together with the assemble command, using -r/--report for a text report and -j/--json-report for a report in JSON format, or exported afterwards using the exportReports command.
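As a minimal sketch, both report flags can be passed in one invocation. The helper below only builds the argument list and does not run MiXCR. The function name and the file names are hypothetical; only the -r and -j flags and the input/output order (taken from the commandLine field of the sample JSON report) come from this page.

```python
def build_assemble_command(input_vdjca, output_clns, txt_report=None, json_report=None):
    """Build (but do not run) an argument list for `mixcr assemble`."""
    cmd = ["mixcr", "assemble"]
    if txt_report is not None:
        cmd += ["-r", txt_report]    # text report
    if json_report is not None:
        cmd += ["-j", json_report]   # JSON report
    cmd += [input_vdjca, output_clns]
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_assemble_command(
    "sample.vdjca",
    "sample.clns",
    txt_report="sample_assembleReport.txt",
    json_report="sample_assembleReport.json",
)
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run(cmd, check=True) on a machine where MiXCR is installed.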
Sample report in text format, followed by the JSON format:
============== Assemble Report ==============
Analysis time: 0ns
Number of input groups: 22887
Number of input alignments: 401520
Number of output pre-clonotypes: 21677
Number of clonotypes per group
0: + 120 (0.57%) = 120 (0.57%)
1: + 20134 (95.85%) = 20254 (96.42%)
2: + 710 (3.38%) = 20964 (99.8%)
3: + 41 (0.2%) = 21005 (100%)
Number of core alignments: 373026 (92.9%)
Discarded core alignments: 14079 (3.77%)
Empirically assigned alignments: 1049 (0.26%)
Empirical assignment conflicts: 1 (0%)
UMI+VJ-gene empirically assigned alignments: 1050 (0.26%)
VJ-gene empirically assigned alignments: 0 (0%)
UMI empirically assigned alignments: 0 (0%)
Number of ambiguous UMIs: 751
Number of ambiguous V-genes: 404
Number of ambiguous J-genes: 64
Number of ambiguous UMI+V/J-gene combinations: 468
Unassigned alignments: 19570 (4.87%)
Final clonotype count: 2419
Average number of reads per clonotype: 152.4
Reads used in clonotypes, percent of total: 368667 (33.95%)
Reads used in clonotypes before clustering, percent of total: 369350 (34.02%)
Number of reads used as a core, percent of used: 367264 (99.44%)
Mapped low quality reads, percent of used: 2086 (0.56%)
Reads clustered in PCR error correction, percent of used: 683 (0.18%)
Reads pre-clustered due to the similar VJC-lists, percent of used: 0 (0%)
Reads dropped due to the lack of a clone sequence, percent of total: 22 (0%)
Reads dropped due to low quality, percent of total: 0 (0%)
Reads dropped due to failed mapping, percent of total: 4687 (0.43%)
Reads dropped with low quality clones, percent of total: 0 (0%)
Clonotypes eliminated by PCR error correction: 113
Clonotypes dropped as low quality: 0
Clonotypes pre-clustered due to the similar VJC-lists: 0
Clonotypes dropped in fine filtering: 0
Partially aligned reads attached to clones by tags: 0 (0%)
Partially aligned reads with ambiguous clone attachments by tags: 0 (0%)
Partially aligned reads failed to attach to clones by tags: 0 (0%)
TRB chains: 1024 (42.33%)
TRAD chains: 1370 (56.63%)
TRG chains: 25 (1.03%)
{
"type": "assemblerReport",
"commandLine": "assemble -r P15-M2-DNEG_assembleReport.txt -f P15-M2-DNEG_corrected.vdjca P15-M2-DNEG.clns",
"inputFiles": [
"P15-M2-DNEG_corrected.vdjca"
],
"outputFiles": [
"P15-M2-DNEG.clns"
],
"version": "unspecified; built=Sat Jul 09 19:09:10 CEST 2022; rev=204bb4540f; lib=repseqio.v2.0",
"preCloneAssemblerReport": {
"type": "preCloneAssemblerReport",
"inputGroups": 22887,
"inputAlignments": 401520,
"clonotypes": 21677,
"clonotypesPerGroup": {
"0": 120,
"1": 20134,
"2": 710,
"3": 41
},
"coreAlignments": 373026,
"discardedCoreAlignments": 14079,
"empiricallyAssignedAlignments": 1049,
"vjEmpiricallyAssignedAlignments": 0,
"umiEmpiricallyAssignedAlignments": 0,
"gatEmpiricallyAssignedAlignments": 1050,
"empiricalAssignmentConflicts": 1,
"unassignedAlignments": 19570,
"umiConflicts": 751,
"gatConflicts": 468,
"geneConflicts": {
"Variable": 404,
"Joining": 64
},
"coreClonotypesDroppedByTagSuffix": 0,
"coreAlignmentsDroppedByTagSuffix": 0
},
"totalReadsProcessed": 1085843,
"initialClonesCreated": 2532,
"readsDroppedNoTargetSequence": 22,
"readsDroppedLowQuality": 16,
"coreReads": 367264,
"readsDroppedFailedMapping": 4687,
"lowQualityRescued": 2086,
"clonesClustered": 113,
"readsClustered": 683,
"clones": 2419,
"clonesDroppedAsLowQuality": 0,
"clonesDroppedInFineFiltering": 0,
"clonesPreClustered": 0,
"readsPreClustered": 0,
"readsInClones": 368667,
"readsInClonesBeforeClustering": 369350,
"readsDroppedWithLowQualityClones": 0,
"clonalChainUsage": {
"type": "chainUsage",
"chimeras": 0,
"total": 2419,
"chains": {
"TRB": 1024,
"TRAD": 1370,
"TRG": 25
}
},
"readsAttachedByTags": 0,
"readsWithAmbiguousAttachmentsByTags": 0,
"readsFailedToAttachedByTags": 0
}
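The JSON report is convenient for downstream checks. The sketch below parses an excerpt of the sample above and recomputes three values from the text report. The mapping of JSON fields to text lines is inferred from matching numbers in this sample, so treat it as an assumption rather than a documented contract.

```python
import json

# Excerpt of the sample JSON report above (only the fields used here).
report_json = """
{
  "type": "assemblerReport",
  "totalReadsProcessed": 1085843,
  "initialClonesCreated": 2532,
  "clonesClustered": 113,
  "clones": 2419,
  "readsInClones": 368667
}
"""

report = json.loads(report_json)

# "Reads used in clonotypes, percent of total"
reads_in_clones_pct = 100 * report["readsInClones"] / report["totalReadsProcessed"]

# "Average number of reads per clonotype"
avg_reads_per_clone = report["readsInClones"] / report["clones"]

# In this sample no clonotypes were dropped for other reasons, so the final
# count equals the initial clones minus those removed by PCR error correction.
clones_after_clustering = report["initialClonesCreated"] - report["clonesClustered"]

print(f"Reads used in clonotypes: {reads_in_clones_pct:.2f}% of total")
print(f"Average reads per clonotype: {avg_reads_per_clone:.1f}")
print(f"Clonotypes after PCR error correction: {clones_after_clustering}")
```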
Pre-clone assembler report
The first part of the report covers consensus assembly based on UMIs and cell barcodes:
Number of input groups
- Number of groups defined by a unique combination of barcodes. For a single-cell UMI-barcoded library, this equals the number of unique CellId+UMI groups.
Number of input alignments
- Total number of alignments in the input .vdjca file.
Number of output pre-clonotypes
- Total number of consensuses assembled among all groups.
Number of clonotypes per group
- Distribution of groups by the number of consensuses assembled in each group.
How to read this value
Number of clonotypes per group
0: + 1209 (0.04%) = 1209 (0.04%)
1: + 2891630 (98.45%) = 2892839 (98.5%)
2~3: + 44182 (1.5%) = 2937021 (100%)
Each row reads: consensuses per group: + number of groups (percent) = cumulative number of groups (cumulative percent).
- For 1209 groups, no consensus was assembled, for reasons such as bad quality, a low number of reads, or other conflicts.
- For 2891630 groups, 1 consensus was assembled.
- For 44182 groups, 2 or 3 consensuses were assembled.
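The same rows can be rebuilt from the clonotypesPerGroup field of the JSON report. The sketch below uses the values from the sample report at the top of this page. It assumes every key is a single integer; a merged bin such as 2~3 would need separate handling.

```python
# clonotypesPerGroup from the sample JSON report:
# key = consensuses assembled in a group, value = number of such groups.
clonotypes_per_group = {0: 120, 1: 20134, 2: 710, 3: 41}

total_groups = sum(clonotypes_per_group.values())

rows = []
cumulative = 0
for n_consensuses in sorted(clonotypes_per_group):
    groups = clonotypes_per_group[n_consensuses]
    cumulative += groups
    rows.append((
        n_consensuses,
        groups,
        100 * groups / total_groups,       # share of groups in this row
        cumulative,
        100 * cumulative / total_groups,   # cumulative share
    ))

for n, groups, pct, cum, cum_pct in rows:
    print(f"{n}: + {groups} ({pct:.2f}%) = {cum} ({cum_pct:.2f}%)")

# Weighting each row by its consensus count gives the total number of
# consensuses, which matches "Number of output pre-clonotypes" in the sample.
total_consensuses = sum(n * groups for n, groups in clonotypes_per_group.items())
print(total_consensuses)
```

Note that the percentages in the sample are relative to the histogram total (21005 groups), not to the number of input groups (22887).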
Number of core alignments
- Number of alignments that cover assemblingFeature, which were used to assemble consensuses.
Discarded core alignments
- Number of alignments that cover assemblingFeature but were not assigned to any consensus.
Empirically assigned alignments
- Number of alignments that do not cover assemblingFeature but were still assigned to consensuses. These alignments will be used by mixcr assembleContigs, if applied.
Empirical assignment conflicts
- Number of conflicts encountered in empirical assignment.
UMI+VJ-gene empirically assigned alignments
- Number of alignments that were assigned to consensuses based on UMI, V and J genes sequences.
VJ-gene empirically assigned alignments
- Number of alignments that were assigned to consensuses based on V and J genes sequences.
UMI empirically assigned alignments
- Number of alignments that were assigned to consensuses based on UMI sequence.
Number of ambiguous UMIs
- Number of UMI conflict events.
Number of ambiguous V-genes
- Number of events in which two or more consensuses inside an alignment group share the same V-genes, so V-gene driven empirical assignment was not possible.
Number of ambiguous J-genes
- Number of events in which two or more consensuses inside an alignment group share the same J-genes, so J-gene driven empirical assignment was not possible.
Number of ambiguous UMI+V/J-gene combinations
- Number of UMI+V/J-gene conflict events.
Unassigned alignments
- Number of alignments that were not assigned to any consensus, for various reasons.
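The percentages printed next to these counters do not all share one denominator. The sketch below reproduces them from the sample preCloneAssemblerReport values. The choice of denominators is inferred from the sample numbers (the discarded share only matches when taken relative to core alignments) and is an assumption, not a documented rule.

```python
# Values copied from preCloneAssemblerReport in the sample JSON report.
pre_clone = {
    "inputAlignments": 401520,
    "coreAlignments": 373026,
    "discardedCoreAlignments": 14079,
    "empiricallyAssignedAlignments": 1049,
    "unassignedAlignments": 19570,
}


def pct(part, whole):
    """Percentage rounded to two decimals, as in the text report."""
    return round(100 * part / whole, 2)


total = pre_clone["inputAlignments"]
core = pre_clone["coreAlignments"]

core_pct = pct(core, total)                                              # 92.9
discarded_pct = pct(pre_clone["discardedCoreAlignments"], core)          # 3.77, relative to core
empirical_pct = pct(pre_clone["empiricallyAssignedAlignments"], total)   # 0.26
unassigned_pct = pct(pre_clone["unassignedAlignments"], total)           # 4.87

print(core_pct, discarded_pct, empirical_pct, unassigned_pct)
```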
Assembler and clustering report
The rest of the report describes assembly regardless of barcodes:
Final clonotype count
- Number of clonotypes left after all artificial diversity error-corrections (PCR-errors and V/J/C mis-assignment correction)
Average number of reads per clonotype
- Average number of reads per final clonotype
Reads used in clonotypes, percent of total
- Total number of reads assembled into final clonotypes (percent of total number of reads). This number excludes or includes reads from clonotypes eliminated in error correction, depending on -OaddReadsCountOnClustering.
Reads used in clonotypes before clustering, percent of total
- Number of reads used in clonotypes before clustering (PCR-error correction) (percent of total number of reads)
Number of reads used as a core, percent of used
- Number of core alignments with no low-quality nucleotides (defined by badQualityThreshold). These alignments form core clonotypes (percent of reads used in clonotypes).
Mapped low quality reads, percent of used
- Number of rescued low-quality reads that were aggregated by the corresponding clonotype. See mapping
Reads clustered in PCR error correction, percent of used
- Number of reads clustered in PCR error correction, percent of used. See clustering
Reads pre-clustered due to the similar VJC-lists, percent of used
- Reads pre-clustered due to the similar VJC-lists, percent of used
Reads dropped due to the lack of a clone sequence, percent of total
- Reads dropped due to the lack of a clone sequence (assemblingFeature), percent of total
Reads dropped due to low quality, percent of total
- Reads dropped due to too many positions with low quality, percent of total
Reads dropped due to failed mapping, percent of total
- Reads dropped due to failed mapping, percent of total. See mapping
Reads dropped with low quality clones, percent of total
- Reads dropped with low quality clones, percent of total
Clonotypes eliminated by PCR error correction
- Number of clonotypes eliminated by PCR error correction
Clonotypes dropped as low quality
- Number of clonotypes dropped due to low quality after mapping, pre-clustering and clustering.
Clonotypes pre-clustered due to the similar VJC-lists
- Clonotypes pre-clustered due to the similar VJC-lists
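The read counters in this part of the report can be cross-checked against each other. The sketch below uses the sample JSON values. The two identities and the "percent of used" denominator (reads used before clustering) are observations from this sample; whether they hold for other settings, for example a different -OaddReadsCountOnClustering value or runs where clonotypes are dropped, is an assumption.

```python
# Values copied from the sample JSON report.
report = {
    "coreReads": 367264,
    "lowQualityRescued": 2086,
    "readsClustered": 683,
    "readsInClones": 368667,
    "readsInClonesBeforeClustering": 369350,
}


def pct(part, whole):
    """Percentage rounded to two decimals, as in the text report."""
    return round(100 * part / whole, 2)


# Core reads plus rescued low-quality reads give the reads used before clustering.
used_before_clustering = report["coreReads"] + report["lowQualityRescued"]

# In this sample, reads from clonotypes eliminated by PCR error correction
# are not counted in the final clonotypes.
used_after_clustering = used_before_clustering - report["readsClustered"]

print(used_before_clustering == report["readsInClonesBeforeClustering"])
print(used_after_clustering == report["readsInClones"])

# "percent of used" lines, relative to reads used before clustering.
core_pct = pct(report["coreReads"], used_before_clustering)           # 99.44
rescued_pct = pct(report["lowQualityRescued"], used_before_clustering)  # 0.56
clustered_pct = pct(report["readsClustered"], used_before_clustering)   # 0.18
print(core_pct, rescued_pct, clustered_pct)
```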